% This is DVICOPY.WEB in text format, as of August 6, 1990.
% This program by P. Breitenlohner is not copyrighted and can be used freely.
% Version 0.9 was finished May 21, 1990.
% Version 0.91 fixed several bugs (May 22, 1990).
% Version 0.92 introduced statistics (May 25, 1990).
% Version 0.95 modified preamble comment, "history" (July 23, 1990).
% Version 1.0 pixel rounding for real devices (August 6, 1990).
% Here is TeX material that gets inserted after \input webmac
\def\hang{\hangindent 3em\indent\ignorespaces}
\font\ninerm=cmr9
\let\mc=\ninerm % medium caps for names like SAIL
\def\PASCAL{Pascal}
\mathchardef\RA="3221 % right arrow
\def\(#1){} % this is used to make section names sort themselves better
\def\9#1{} % this is used for sort keys in the index
\def\title{DVI\lowercase{copy}}
\def\contentspagenumber{1}
\def\topofcontents{\null
\def\titlepage{F} % include headline on the contents page
\def\rheader{\mainfont\hfil \contentspagenumber}
\vfill
\centerline{\titlefont The {\ttitlefont DVIcopy} processor}
\vskip 15pt
\centerline{(Version 1.0, August 1990)}
\vfill}
\def\botofcontents{\vfill
\centerline{\hsize 5in\baselineskip9pt
\vbox{\ninerm\noindent
This program was developed at the
Max-Planck-Institut f\"ur Physik
(Werner-Heisenberg-Institut), Munich, Germany.
`\TeX' is a trademark of the American Mathematical Society.}}}
\pageno=\contentspagenumber \advance\pageno by 1
@* Introduction.
The \.{DVIcopy} utility program copies (selected pages of) binary
device-independent (``\.{DVI}'') files that are produced by document
compilers such as \TeX, and replaces all references to characters from
virtual fonts by the typesetting instructions specified for them in
binary virtual-font (``\.{VF}'') files.

This program has two chief purposes: (1)~It can be used as a preprocessor
for existing \.{DVI}-related software in cases where this software is
unable to handle virtual fonts or (given suitable \.{VF} files) where
this software cannot handle fonts with more than 128~characters;
and (2)~it serves as an example of a program that reads \.{DVI} and
\.{VF} files correctly, for system programmers who are developing
\.{DVI}-related software.

Goal number (1) is important since quite a few existing programs have
to be adapted to the extended capabilities of Version~3 of \TeX, which
will require some time. Moreover, some existing programs are provided
`as is' and their source code is, unfortunately, not available.

Goal number (2) needs perhaps a bit more explanation. Programs for
typesetting need to be especially careful about how they do arithmetic; if
rounding errors accumulate, margins won't be straight, vertical rules
won't line up, and so on (see the documentation of \.{DVItype} for more
details). This program is written as if it were a \.{DVI}-driver for a
hypothetical typesetting device |out_file|, the output file receiving
the copy of the input |dvi_file|. In addition all code related to
|out_file| is concentrated in a few chapters of this program and quite
independent of the rest of the code concerned with the decoding of
\.{DVI} and \.{VF} files and with font substitutions. Thus it should be
relatively easy to replace the device dependent code of this program by
the corresponding code required for a real typesetting device.
With this in mind, \.{DVItype}'s pixel rounding algorithms are included
as conditional code not used by \.{DVIcopy}.

The |banner| and |preamble_comment| strings defined here should be
changed whenever \.{DVIcopy} gets modified.
@d banner=='This is DVIcopy, Version 1.0' {printed when the program starts}
@d preamble_comment=='DVIcopy 1.0 output from '
@d comm_length=24 {length of |preamble_comment|}
@d from_length=6 {length of its |' from '| part}
@ This program is written in standard \PASCAL, except where it is necessary
to use extensions; for example, \.{DVIcopy} must read files whose names
are dynamically specified, and that would be impossible in pure \PASCAL.
All places where nonstandard constructions are used have been listed in
the index under ``system dependencies.''
@!@^system dependencies@>

One of the extensions to standard \PASCAL\ that we shall deal with is the
ability to move to a random place in a binary file; another is to
determine the length of a binary file. Such extensions are not necessary
for reading \.{DVI} files; since \.{DVIcopy} is (a model for) a
production program it should, however, be made as efficient as possible
for a particular system. If \.{DVIcopy} is being used with
\PASCAL s for which random file positioning is not efficiently available,
the following definition should be changed from |true| to |false|; in such
cases, \.{DVIcopy} will not include the optional feature that reads the
postamble first.
@d random_reading==true {should we skip around in the file?}
@ The program begins with a fairly normal header, made up of pieces that
@^system dependencies@>
will mostly be filled in later. The \.{DVI} input comes from file
|dvi_file|, the \.{DVI} output goes to file |out_file|, and messages
go to \PASCAL's standard |output| file.
The \.{TFM} and \.{VF} files are defined later since their external
names are determined dynamically.

If it is necessary to abort the job because of a fatal error, the program
calls the `|jump_out|' procedure, which goes to the label |final_end|.
@d final_end = 9999 {go here to wrap it up}
@p @t\4@>@<Compiler directives@>@/
program DVI_copy(@!dvi_file,@!out_file,@!output);
label final_end;
const @<Constants in the outer block@>@/
type @<Types in the outer block@>@/
var @<Globals in the outer block@>@/
@<Error handling procedures@>@/
procedure initialize; {this procedure gets things started properly}
var @<Local variables for initialization@>@/
begin print_ln(banner);@/
@<Set initial values@>@/
end;
@ On some systems it is necessary to use various integer subrange types
in order to make \.{\title} efficient; this is true in particular for
frequently used variables such as loop indices. Consider an integer
variable |x| with values in the range |0..255|: on most small systems
|x| should be a one- or two-byte integer whereas on most large systems
|x| should be a four-byte integer.

Clearly the author of a program knows best which range of values is
required for each variable; thus \.{\title} never uses \PASCAL's |integer|
type. All integer variables are declared as one of the integer subrange
types defined below as \.{WEB} macros or \PASCAL\ types; these definitions
can be used without system-dependent changes, provided the signed 32~bit
integers are a subset of the standard type |integer|, and the compiler
automatically uses the optimal representation for integer subranges
(both conditions need not be satisfied for a particular system).
@^system dependencies@>

The complementary problem of storing large arrays of integer type
variables as compactly as possible is addressed differently; here
\.{\title} uses a \PASCAL\ |type|~declaration for each kind of array
element.

Note that the primary purpose of these definitions is optimization, not
range checking. All places where optimization for a particular system is
highly desirable have been listed in the index under ``optimization.''
@!@^optimization@>
@d int_32 == integer {signed 32~bit integers}
@<Types...@>=
@!int_31 = 0..@"7FFFFFFF; {unsigned 31~bit integer}
@!int_24u = 0..@"FFFFFF; {unsigned 24~bit integer}
@!int_24 = -@"800000..@"7FFFFF; {signed 24~bit integer}
@!int_23 = 0..@"7FFFFF; {unsigned 23~bit integer}
@!int_16u = 0..@"FFFF; {unsigned 16~bit integer}
@!int_16 = -@"8000..@"7FFF; {signed 16~bit integer}
@!int_15 = 0..@"7FFF; {unsigned 15~bit integer}
@!int_8u = 0..@"FF; {unsigned 8~bit integer}
@!int_8 = -@"80..@"7F; {signed 8~bit integer}
@!int_7 = 0..@"7F; {unsigned 7~bit integer}
@ Some of this code is optional for use when debugging only;
such material is enclosed between the delimiters |debug| and $|gubed|$.
Other parts, delimited by |stat| and $|tats|$, are optionally included
if statistics about \.{\title}'s memory usage are desired.
@d debug==@{ {change this to `$\\{debug}\equiv\null$' when debugging}
@d gubed==@t@>@} {change this to `$\\{gubed}\equiv\null$' when debugging}
@f debug==begin
@f gubed==end
@d stat==@{ {change this to `$\\{stat}\equiv\null$'
when gathering usage statistics}
@d tats==@t@>@} {change this to `$\\{tats}\equiv\null$'
when gathering usage statistics}
@f stat==begin
@f tats==end
@ As mentioned above, \.{DVIcopy} has two chief purposes: (1)~It produces
a copy of the input \.{DVI} file with all references to characters from
virtual fonts replaced by their expansion as specified in the character
packets of \.{VF} files; and (2)~it serves as an example of a program
that reads \.{DVI} and \.{VF} files correctly, for system programmers
who are developing \.{DVI}-related software.

Parts of the program that are needed in (2) but not in (1) are delimited
by the codewords `$|device|\ldots|ecived|$'; these are mostly the pixel
rounding algorithms used to convert the \.{DVI} units of a \.{DVI} file
to the raster units of a real output device and have been copied more or
less verbatim from \.{DVItype}.
@d device==@{ {change this to `$\\{device}\equiv\null$' when output
for a real device is produced}
@d ecived==@t@>@} {change this to `$\\{ecived}\equiv\null$' when output
for a real device is produced}
@f device==begin
@f ecived==end
@ The \PASCAL\ compiler used to develop this program has ``compiler
directives'' that can appear in comments whose first character is a dollar sign.
In production versions of \.{\title} these directives tell the compiler that
@^system dependencies@>
it is safe to avoid range checks and to leave out the extra code it inserts
for the \PASCAL\ debugger's benefit, although interrupts will occur if
there is arithmetic overflow.
@<Compiler directives@>=
@{@&$C-,A+,D-@} {no range check, catch arithmetic overflow, no debug overhead}
@!debug @{@&$C+,D+@}@+ gubed {but turn everything on when debugging}
@ Labels are given symbolic names by the following definitions. We insert
the label `|exit|:' just before the `\ignorespaces|end|\unskip' of a
procedure in which we have used the `|return|' statement defined below;
the label `|restart|' is occasionally used at the very beginning of a
procedure; and the label `|reswitch|' is occasionally used just prior to
a \&{case} statement in which some cases change the conditions and we wish to
branch to the newly applicable case.

Loops that are set up with the \&{loop} construction defined below are
commonly exited by going to `|done|' or to `|found|' or to `|not_found|',
and they are sometimes repeated by going to `|continue|'.
@d exit=10 {go here to leave a procedure}
@d restart=20 {go here to start a procedure again}
@d reswitch=21 {go here to start a case statement again}
@d continue=22 {go here to resume a loop}
@d done=30 {go here to exit a loop}
@d found=31 {go here when you've found it}
@d not_found=32 {go here when you've found something else}
@ The term |print| is used instead of |write| when this program writes on
|output|, so that all such output could easily be redirected if desired;
the term |d_print| is used for conditional output if we are debugging.
@d print(#)==write(output,#)
@d print_ln(#)==write_ln(output,#)
@d d_print(#)==@!debug print(#) @; @+ gubed
@d d_print_ln(#)==@! debug print_ln(#) @; @+ gubed
@ Here are some macros for common programming idioms.
@d incr(#) == #:=#+1 {increase a variable by unity}
@d decr(#) == #:=#-1 {decrease a variable by unity}
@d Incr_Decr(#) == #
@d Incr(#) == #:=#+Incr_Decr {increase a variable}
@d Decr(#) == #:=#-Incr_Decr {decrease a variable}
@d loop == @+ while true do@+ {repeat over and over until a |goto| happens}
@d do_nothing == {empty statement}
@d return == goto exit {terminate a procedure call}
@f return == nil
@f loop == xclause
@ We assume that |case| statements may include a default case that applies
if no matching label is found. Thus, we shall use constructions like
@^system dependencies@>
$$\vbox{\halign{#\hfil\cr
|case x of|\cr
1: $\langle\,$code for $x=1\,\rangle$;\cr
3: $\langle\,$code for $x=3\,\rangle$;\cr
|othercases| $\langle\,$code for |x<>1| and |x<>3|$\,\rangle$\cr
|endcases|\cr}}$$
since most \PASCAL\ compilers have plugged this hole in the language by
incorporating some sort of default mechanism. For example, the compiler
used to develop \.{WEB} and \TeX\ allows `|others|:' as a default label,
and other \PASCAL s allow syntaxes like `\ignorespaces|else|\unskip' or
`\&{otherwise}' or `\\{otherwise}:', etc. The definitions of |othercases|
and |endcases| should be changed to agree with local conventions. (Of
course, if no default mechanism is available, the |case| statements of
this program must be extended by listing all remaining cases.
Donald~E. Knuth, the author of the \.{WEB} system program \.{TANGLE},
@^Knuth, Donald Ervin@>
would have taken the trouble to modify \.{TANGLE} so that such extensions
were done automatically, if he had not wanted to encourage \PASCAL\
compiler writers to make this important change in \PASCAL, where it belongs.)
@d othercases == others: {default for cases not listed explicitly}
@d endcases == @+end {follows the default case in an extended |case| statement}
@f othercases == else
@f endcases == end
@ The definition of |max_font_type| should be adapted to the number of
font types used by the program; the first two values have a fixed meaning:
|new_font_type=0| indicates that a font has been defined but has
not yet been used, and |vf_font_type=1| indicates a virtual font;
font type values |>=2| indicate real fonts and different font types
could be used to distinguish various kinds of font files (\.{GF} or
\.{PK} or \.{PXL}).
@!@^font types@>
@d new_font_type=0 {this font has been defined but has not yet been used}
@d vf_font_type=1 {this font is a virtual font}
@d out_font_type=2 {this font has been used in |out_file|}
@d max_font_type=2
@ The following parameters can be changed at compile time to extend or
reduce \.{DVIcopy}'s capacity.
@<Constants...@>=
@!max_fonts=100; {maximum number of distinct fonts}
@!max_chars=10000; {maximum number of different characters among all fonts}
@!max_widths=3000; {maximum number of different character widths}
@!max_packets=5000; {maximum number of different character packets;
must be less than 65536}
@!max_bytes=30000; {maximum number of bytes for character packets}
@!max_recursion=10; {\.{VF} files shouldn't recurse beyond this level}
@!stack_size=100; {\.{DVI} files shouldn't |push| beyond this depth}
@!name_length=50; {a file name shouldn't be longer than this}
@ A global variable called |history| will contain one of four values
at the end of every run: |spotless| means that no unusual messages were
printed; |harmless_message| means that a message of possible interest
was printed but no serious errors were detected; |error_message| means that
at least one error was found; |fatal_message| means that the program
terminated abnormally. The value of |history| does not influence the
behavior of the program; it is simply computed for the convenience
of systems that might want to use such information.
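The bookkeeping described above can also be sketched outside \.{WEB}.
The following Python fragment is an illustration only: the class name
\.{History} and its method layout are hypothetical and not part of
\.{DVIcopy}; the four values and the three marking operations mirror the
macro definitions of this section.

```python
# The four |history| values, in increasing order of severity.
SPOTLESS, HARMLESS_MESSAGE, ERROR_MESSAGE, FATAL_MESSAGE = range(4)


class History:
    """Hypothetical holder for the run status (not part of DVIcopy)."""

    def __init__(self) -> None:
        self.value = SPOTLESS

    def mark_harmless(self) -> None:
        # Only upgrades a spotless run; never overwrites a worse state.
        if self.value == SPOTLESS:
            self.value = HARMLESS_MESSAGE

    def mark_error(self) -> None:
        # Assigned unconditionally, as in the macro of the same name.
        self.value = ERROR_MESSAGE

    def mark_fatal(self) -> None:
        self.value = FATAL_MESSAGE


run = History()
run.mark_error()
run.mark_harmless()  # has no effect: the run is no longer spotless
```

In this sketch a harmless message never hides an earlier error, because
only the harmless mark tests the current value before assigning.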
@d spotless=0 {|history| value for normal jobs}
@d harmless_message=1 {|history| value when non-serious info was printed}
@d error_message=2 {|history| value when an error was noted}
@d fatal_message=3 {|history| value when we had to stop prematurely}
@d mark_harmless==@t@>@+if history=spotless then history:=harmless_message
@d mark_error==history:=error_message
@d mark_fatal==history:=fatal_message
@<Glob...@>=@!history:spotless..fatal_message; {how bad was this run?}
@ @<Set init...@>=history:=spotless;
@* The character set.
Like all programs written with the \.{WEB} system, \.{\title} can be
used with any character set. But it uses ASCII code internally, because
the programming for portable input-output is easier when a fixed internal
code is used, and because \.{DVI} and \.{VF} files use ASCII code for
file names and certain other strings.

The next few sections of \.{\title} have therefore been copied from the
analogous ones in the \.{WEB} system routines. They have been considerably
simplified, since \.{\title} need not deal with the controversial
ASCII codes less than @'40 or greater than @'176.
If such codes appear in the \.{DVI} file,
they will be printed as question marks.
@<Types...@>=
@!ASCII_code=" ".."~"; {a subrange of the integers}
@ The original \PASCAL\ compiler was designed in the late 60s, when six-bit
character sets were common, so it did not make provision for lower case
letters. Nowadays, of course, we need to deal with both upper and lower case
alphabets in a convenient way, especially in a program like \.{\title}.
So we shall assume that the \PASCAL\ system being used for \.{\title}
has a character set containing at least the standard visible characters
of ASCII code (|"!"| through |"~"|).

Some \PASCAL\ compilers use the original name |char| for the data type
associated with the characters in text files, while other \PASCAL s
consider |char| to be a 64-element subrange of a larger data type that has
some other name. In order to accommodate this difference, we shall use
the name |text_char| to stand for the data type of the characters in the
output file. We shall also assume that |text_char| consists of
the elements |chr(first_text_char)| through |chr(last_text_char)|,
inclusive. The following definitions should be adjusted if necessary.
@^system dependencies@>
@d text_char == char {the data type of characters in text files}
@d first_text_char=0 {ordinal number of the smallest element of |text_char|}
@d last_text_char=127 {ordinal number of the largest element of |text_char|}
@<Types...@>=
@!text_file=packed file of text_char;
@ @<Local variables for init...@>=
@!i:int_16; {loop index for initializations}
@ The \.{\title} processor converts between ASCII code and
the user's external character set by means of arrays |xord| and |xchr|
that are analogous to \PASCAL's |ord| and |chr| functions.
@<Globals...@>=
@!xord: array [text_char] of ASCII_code;
{specifies conversion of input characters}
@!xchr: array [0..255] of text_char;
{specifies conversion of output characters}
@ Under our assumption that the visible characters of standard ASCII are
all present, the following assignment statements initialize the
|xchr| array properly, without needing any system-dependent changes.
@<Set init...@>=
for i:=0 to @'37 do xchr[i]:='?';
xchr[@'40]:=' ';
xchr[@'41]:='!';
xchr[@'42]:='"';
xchr[@'43]:='#';
xchr[@'44]:='$';
xchr[@'45]:='%';
xchr[@'46]:='&';
xchr[@'47]:='''';@/
xchr[@'50]:='(';
xchr[@'51]:=')';
xchr[@'52]:='*';
xchr[@'53]:='+';
xchr[@'54]:=',';
xchr[@'55]:='-';
xchr[@'56]:='.';
xchr[@'57]:='/';@/
xchr[@'60]:='0';
xchr[@'61]:='1';
xchr[@'62]:='2';
xchr[@'63]:='3';
xchr[@'64]:='4';
xchr[@'65]:='5';
xchr[@'66]:='6';
xchr[@'67]:='7';@/
xchr[@'70]:='8';
xchr[@'71]:='9';
xchr[@'72]:=':';
xchr[@'73]:=';';
xchr[@'74]:='<';
xchr[@'75]:='=';
xchr[@'76]:='>';
xchr[@'77]:='?';@/
xchr[@'100]:='@@';
xchr[@'101]:='A';
xchr[@'102]:='B';
xchr[@'103]:='C';
xchr[@'104]:='D';
xchr[@'105]:='E';
xchr[@'106]:='F';
xchr[@'107]:='G';@/
xchr[@'110]:='H';
xchr[@'111]:='I';
xchr[@'112]:='J';
xchr[@'113]:='K';
xchr[@'114]:='L';
xchr[@'115]:='M';
xchr[@'116]:='N';
xchr[@'117]:='O';@/
xchr[@'120]:='P';
xchr[@'121]:='Q';
xchr[@'122]:='R';
xchr[@'123]:='S';
xchr[@'124]:='T';
xchr[@'125]:='U';
xchr[@'126]:='V';
xchr[@'127]:='W';@/
xchr[@'130]:='X';
xchr[@'131]:='Y';
xchr[@'132]:='Z';
xchr[@'133]:='[';
xchr[@'134]:='\';
xchr[@'135]:=']';
xchr[@'136]:='^';
xchr[@'137]:='_';@/
xchr[@'140]:='`';
xchr[@'141]:='a';
xchr[@'142]:='b';
xchr[@'143]:='c';
xchr[@'144]:='d';
xchr[@'145]:='e';
xchr[@'146]:='f';
xchr[@'147]:='g';@/
xchr[@'150]:='h';
xchr[@'151]:='i';
xchr[@'152]:='j';
xchr[@'153]:='k';
xchr[@'154]:='l';
xchr[@'155]:='m';
xchr[@'156]:='n';
xchr[@'157]:='o';@/
xchr[@'160]:='p';
xchr[@'161]:='q';
xchr[@'162]:='r';
xchr[@'163]:='s';
xchr[@'164]:='t';
xchr[@'165]:='u';
xchr[@'166]:='v';
xchr[@'167]:='w';@/
xchr[@'170]:='x';
xchr[@'171]:='y';
xchr[@'172]:='z';
xchr[@'173]:='{';
xchr[@'174]:='|';
xchr[@'175]:='}';
xchr[@'176]:='~';
for i:=@'177 to 255 do xchr[i]:='?';
@ The following system-independent code makes the |xord| array contain a
suitable inverse to the information in |xchr|.
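The two-step construction of the inverse table can be sketched in Python
as follows. This is a minimal illustration that assumes the host
character set is ASCII, so that the built-in \.{chr} stands in for the
explicit assignments to |xchr|; the upper-case constant names are
hypothetical renderings of |first_text_char| and |last_text_char|.

```python
FIRST_TEXT_CHAR = 0    # assumed value of first_text_char
LAST_TEXT_CHAR = 127   # assumed value of last_text_char

# xchr: internal code -> external character; invisible codes become '?'.
xchr = ["?"] * 256
for code in range(0o40, 0o177):      # octal 40 (space) through octal 176 (~)
    xchr[code] = chr(code)

# xord: external character -> internal code.
# Step 1: every external character defaults to octal 40, a space.
xord = {chr(i): 0o40 for i in range(FIRST_TEXT_CHAR, LAST_TEXT_CHAR + 1)}
# Step 2: visible characters are overwritten with their true codes.
for code in range(0o40, 0o177):
    xord[xchr[code]] = code
```

Because the second loop runs over the visible codes only, the question
mark keeps its own code even though it also serves as the placeholder for
invisible characters.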
@<Set init...@>=
for i:=first_text_char to last_text_char do xord[chr(i)]:=@'40;
for i:=" " to "~" do xord[xchr[i]]:=i;
@* Reporting errors to the user.
The \.{\title} processor does not verify that every single bit read from
one of its binary input files is meaningful and consistent; there are
other programs, e.g., \.{DVItype}, \.{TFtoPL}, and \.{VFtoVP}, specially
designed for that purpose.

On the other hand, \.{\title} is designed to avoid unpredictable results
due to undetected arithmetic overflow, or due to violation of integer
subranges or array bounds under {\it all\/} circumstances. Thus a fair
amount of checking is done when reading and analyzing the input data,
even in cases where such checking reduces the efficiency of the program
to some extent.

The error recovery capabilities of \.{\title} are, at least for the
moment, extremely limited; everything worse than a warning message leads
to the immediate termination of the program.
@ If an input (\.{DVI}, \.{TFM}, \.{VF}, or other) file is badly malformed,
the whole process must be aborted; \.{\title} will give up, after issuing
an error message about what caused the error. These messages will, however,
in most cases just indicate which input file caused the error. One of the
programs \.{DVItype}, \.{TFtoPL} or \.{VFtoVP} should then be used to
diagnose the error in full detail.

Such errors might be discovered inside of subroutines inside of subroutines,
so a procedure called |jump_out| has been introduced. This procedure, which
transfers control to the label |final_end| at the end of the program,
contains the only non-local |@!goto| statement in \.{DVIcopy}.
@^system dependencies@>

Some \PASCAL\ compilers do not implement non-local |goto| statements. In
such cases the |goto final_end| in |jump_out| should simply be replaced
by a call on some system procedure that quietly terminates the program.
@^system dependencies@>
@d abort(#)==begin print_ln(' ',#,'.'); jump_out;
end
@<Error handling...@>=
@<Basic printing procedures@>@;
procedure close_files_and_terminate; forward;
procedure jump_out;
begin mark_fatal; close_files_and_terminate;
goto final_end;
end;
@ Sometimes the program's behavior is far different from what it should
be, and \.{\title} prints an error message that is really for the
\.{\title} maintenance person, not the user. In such cases the program
says |confusion(|indication of where we are|)|.
@<Error handling...@>=
procedure confusion(@!p:pckt_pointer);
begin print(' !This can''t happen ('); print_packet(p); print_ln(').');
@.This can't happen@>
jump_out;
end;
@ An overflow stop occurs if \.{\title}'s tables aren't large enough.
@<Error handling...@>=
procedure overflow(@!p:pckt_pointer;@!n:int_16u);
begin print(' !Sorry, DVIcopy capacity exceeded ['); print_packet(p);
@.Sorry, DVIcopy capacity exceeded@>
print_ln('=',n:1,'].');
jump_out;
end;
@ If an attempt is made to store a second character packet for
extension~|ext|, we give a warning message.
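The three warning procedures that follow share one pattern: report the
first ten occurrences, then announce once that further messages are
suppressed. A minimal Python sketch of that pattern is given here; the
class name \.{CappedWarning} and the idea of collecting the lines in a
list instead of printing them are assumptions made for illustration.

```python
class CappedWarning:
    """Hypothetical helper: report at most `limit` warnings of one kind."""

    def __init__(self, limit: int = 10) -> None:
        self.count = 0
        self.limit = limit
        self.lines = []   # collected here instead of being printed

    def warn(self, message: str) -> None:
        if self.count < self.limit:       # stop telling after `limit` times
            self.lines.append("---" + message)
            self.count += 1
            if self.count == self.limit:
                self.lines.append("---further messages suppressed.")


dup = CappedWarning()
for ext in range(15):
    dup.warn(f"duplicate character packet for extension {ext}")
```

The counter stops at the limit, so a small subrange type such as |int_7|
is sufficient to hold it.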
@p procedure dup_warning(@!ext:int_24);
begin if d_warn_count<10 then {stop telling after first 10 times}
begin print_ln('---duplicate character packet for extension ',ext:1);
@.duplicate character packet@>
incr(d_warn_count); mark_harmless;
if d_warn_count=10 then print_ln('---further messages suppressed.');
end;
end;
@ If there are no character packets (with any extension) for character
residue~|cur_res| and font~|cur_fnt|, we give a warning message.
@p procedure pckt_warning;
begin if p_warn_count<10 then {stop telling after first 10 times}
begin print_ln('---missing character packet character ',cur_res:1,
' from font ',cur_fnt:1);
@.missing character packet@>
incr(p_warn_count); mark_error;
if p_warn_count=10 then print_ln('---further messages suppressed.');
end;
end;
@ If a character packet for extension~|e| is used instead of one for
extension~|ext| (which could not be found), we give a warning message.
@p procedure subst_warning(@!e,@!ext:int_24);
begin if s_warn_count<10 then {stop telling after first 10 times}
begin print_ln('---substituted character packet for extension ',
e:1,' instead of ',ext:1);
@.substituted character packet@>
incr(s_warn_count); mark_error;
if s_warn_count=10 then print_ln('---further messages suppressed.');
end;
end;
@ @<Glob...@>=
@!d_warn_count:int_7; {counts |dup_warning| messages}
@!p_warn_count:int_7; {counts |pckt_warning| messages}
@!s_warn_count:int_7; {counts |subst_warning| messages}
@ @<Set init...@>=
d_warn_count:=0; p_warn_count:=0; s_warn_count:=0;
@* Device-independent file format.
Before we get into the details of \.{\title}, we need to know exactly
what \.{DVI} files are. The form of such files was designed by David R.
@^Fuchs, David Raymond@>
Fuchs in 1979. Almost any reasonable typesetting device can be driven by
a program that takes \.{DVI} files as input, and dozens of such
\.{DVI}-to-whatever programs have been written. Thus, it is possible to
print the output of document compilers like \TeX\ on many different kinds
of equipment.

A \.{DVI} file is a stream of 8-bit bytes, which may be regarded as a
series of commands in a machine-like language. The first byte of each command
is the operation code, and this code is followed by zero or more bytes
that provide parameters to the command. The parameters themselves may consist
of several consecutive bytes; for example, the `|set_rule|' command has two
parameters, each of which is four bytes long. Parameters are usually
regarded as nonnegative integers; but four-byte-long parameters,
and shorter parameters that denote distances, can be
either positive or negative. Such parameters are given in two's complement
notation. For example, a two-byte-long distance parameter has a value between
$-2^{15}$ and $2^{15}-1$.
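The two interpretations of a parameter can be illustrated with a short
Python sketch. The function names \.{unsigned\_param} and
\.{signed\_param} are hypothetical; the sketch relies on the fact that
multi-byte quantities in \.{DVI} files are stored with the most
significant byte first.

```python
def unsigned_param(data: bytes) -> int:
    """Read 1 to 4 parameter bytes as a nonnegative integer."""
    return int.from_bytes(data, byteorder="big", signed=False)


def signed_param(data: bytes) -> int:
    """Read 1 to 4 parameter bytes as a two's complement integer."""
    return int.from_bytes(data, byteorder="big", signed=True)


# The extreme values of a two-byte distance parameter.
lowest = signed_param(bytes([0x80, 0x00]))
highest = signed_param(bytes([0x7F, 0xFF]))
```

The same two bytes \.{FF FF} therefore mean 65535 as a character code but
$-1$ as a distance.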
@.DVI {\rm files}@>

A \.{DVI} file consists of a ``preamble,'' followed by a sequence of one
or more ``pages,'' followed by a ``postamble.'' The preamble is simply a
|pre| command, with its parameters that define the dimensions used in the
file; this must come first. Each ``page'' consists of a |bop| command,
followed by any number of other commands that tell where characters are to
be placed on a physical page, followed by an |eop| command. The pages
appear in the order that they were generated, not in any particular
numerical order. If we ignore |nop| commands and \\{fnt\_def} commands
(which are allowed between any two commands in the file), each |eop|
command is immediately followed by a |bop| command, or by a |post|
command; in the latter case, there are no more pages in the file, and the
remaining bytes form the postamble. Further details about the postamble
will be explained later.

Some parameters in \.{DVI} commands are ``pointers.'' These are four-byte
quantities that give the location number of some other byte in the file;
the first byte is number~0, then comes number~1, and so on. For example,
one of the parameters of a |bop| command points to the previous |bop|;
this makes it feasible to read the pages in backwards order, in case the
results are being directed to a device that stacks its output face up.
Suppose the preamble of a \.{DVI} file occupies bytes 0 to 99. Now if the
first page occupies bytes 100 to 999, say, and if the second
page occupies bytes 1000 to 1999, then the |bop| that starts in byte 1000
points to 100 and the |bop| that starts in byte 2000 points to 1000. (The
very first |bop|, i.e., the one that starts in byte 100, has a pointer of $-1$.)
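The backward chain of pointers in this example can be followed with a few
lines of Python. This is a sketch under stated assumptions: the helper
names \.{make\_bop} and \.{pages\_backwards} are hypothetical, the buffer
is synthetic and contains nothing but the three |bop| commands, and the
location of the last |bop| is simply passed in as an argument. A |bop|
occupies 45 bytes: the opcode 139, ten four-byte counters, and the
four-byte pointer.

```python
BOP = 139  # opcode of the bop command


def make_bop(prev: int) -> bytes:
    """Build a bop: opcode, ten zero counters, pointer to the previous bop."""
    return bytes([BOP]) + bytes(40) + prev.to_bytes(4, "big", signed=True)


def pages_backwards(dvi: bytes, last_bop: int) -> list:
    """Follow previous-bop pointers from last_bop back to the first page."""
    offsets = []
    loc = last_bop
    while loc != -1:                      # the first bop points to -1
        if dvi[loc] != BOP:
            raise ValueError(f"byte {loc} is not a bop")
        offsets.append(loc)
        # The pointer follows the opcode byte and 40 bytes of counters.
        loc = int.from_bytes(dvi[loc + 41:loc + 45], "big", signed=True)
    return offsets


# Layout from the text: preamble in bytes 0..99, bops at 100, 1000, 2000.
buf = bytearray(3000)
buf[100:145] = make_bop(-1)
buf[1000:1045] = make_bop(100)
buf[2000:2045] = make_bop(1000)
order = pages_backwards(bytes(buf), 2000)
```

The resulting list gives the pages in reverse order of generation, which
is the order needed by a device that stacks its output face up.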
@ The \.{DVI} format is intended to be both compact and easily interpreted
by a machine. Compactness is achieved by making most of the information
implicit instead of explicit. When a \.{DVI}-reading program reads the
commands for a page, it keeps track of several quantities: (a)~The current
font |f| is an integer; this value is changed only
by \\{fnt} and \\{fnt\_num} commands. (b)~The current position on the page
is given by two numbers called the horizontal and vertical coordinates,
|h| and |v|. Both coordinates are zero at the upper left corner of the page;
moving to the right corresponds to increasing the horizontal coordinate, and
moving down corresponds to increasing the vertical coordinate. Thus, the
coordinates are essentially Cartesian, except that vertical directions are
flipped; the Cartesian version of |(h,v)| would be |(h,-v)|. (c)~The
current spacing amounts are given by four numbers |w|, |x|, |y|, and |z|,
where |w| and~|x| are used for horizontal spacing and where |y| and~|z|
are used for vertical spacing. (d)~There is a stack containing
|(h,v,w,x,y,z)| values; the \.{DVI} commands |push| and |pop| are used to
change the current level of operation. Note that the current font~|f| is
not pushed and popped; the stack contains only information about
positioning.

The values of |h|, |v|, |w|, |x|, |y|, and |z| are signed integers having up
to 32 bits, including the sign. Since they represent physical distances,
there is a small unit of measurement such that increasing |h| by~1 means
moving a certain tiny distance to the right. The actual unit of
measurement is variable, as explained below.
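The state that a \.{DVI}-reading program keeps while processing a page
can be modeled compactly. The Python sketch below is illustrative only:
the class name \.{DviState} is hypothetical, and range checks on the
32-bit values are omitted. It shows that |push| and |pop| save and
restore the six positioning values while leaving the current font alone.

```python
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class DviState:
    """Hypothetical model of the per-page state of a DVI reader."""
    h: int = 0
    v: int = 0
    w: int = 0
    x: int = 0
    y: int = 0
    z: int = 0
    f: Optional[int] = None            # current font; never stacked
    stack: list = field(default_factory=list)

    def push(self) -> None:
        # Save (h,v,w,x,y,z) without changing any of them.
        self.stack.append((self.h, self.v, self.w, self.x, self.y, self.z))

    def pop(self) -> None:
        if not self.stack:
            raise ValueError("pop with an empty stack")
        self.h, self.v, self.w, self.x, self.y, self.z = self.stack.pop()


state = DviState()
state.f = 7
state.push()
state.h += 100      # move right
state.v += 50       # move down: v grows toward the bottom of the page
state.f = 9         # a font change inside the push/pop group
state.pop()
```

After the final |pop| the position is back at the origin, but the font
selected inside the group remains current.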
@ Here is a list of all the commands that may appear in a \.{DVI} file. Each
command is specified by its symbolic name (e.g., |bop|), its opcode byte
(e.g., 139), and its parameters (if any). The parameters are followed
by a bracketed number telling how many bytes they occupy; for example,
`|p[4]|' means that parameter |p| is four bytes long.
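The bracket notation translates directly into a table of parameter
lengths. The following Python sketch covers only opcodes 0 to 143, as
described in the list that follows; the function name \.{param\_bytes} is
hypothetical, and any other opcode is rejected instead of being guessed.

```python
def param_bytes(opcode: int) -> int:
    """Return the total number of parameter bytes that follow an opcode."""
    if 0 <= opcode <= 127:               # set_char_0 .. set_char_127
        return 0
    if 128 <= opcode <= 131:             # set1 .. set4: c[1] .. c[4]
        return opcode - 127
    if opcode in (132, 137):             # set_rule, put_rule: a[4] b[4]
        return 8
    if 133 <= opcode <= 136:             # put1 .. put4: c[1] .. c[4]
        return opcode - 132
    if opcode == 139:                    # bop: c_0[4] .. c_9[4] p[4]
        return 44
    if opcode in (138, 140, 141, 142):   # nop, eop, push, pop
        return 0
    if opcode == 143:                    # right1: b[1]
        return 1
    raise ValueError(f"opcode {opcode} is outside this sketch")
```

A reader that only needs to skip a command can advance by one byte for
the opcode plus the number of bytes returned here.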
\yskip\hang|set_char_0| 0. Typeset character number~0 from font~|f|
such that the reference point of the character is at |(h,v)|. Then
increase |h| by the width of that character. Note that a character may
have zero or negative width, so one cannot be sure that |h| will advance
after this command; but |h| usually does increase.
\yskip\hang|set_char_1| through |set_char_127| (opcodes 1 to 127).
Do the operations of |set_char_0|; but use the character whose number
matches the opcode, instead of character~0.
\yskip\hang|set1| 128 |c[1]|. Same as |set_char_0|, except that character
number~|c| is typeset. \TeX82 uses this command for characters in the
range |128<=c<256|.
\yskip\hang|set2| 129 |c[2]|. Same as |set1|, except that |c|~is two
bytes long, so it is in the range |0<=c<65536|. \TeX82 never uses this
command, which is intended for processors that deal with oriental languages;
but \.{\title} will allow character codes greater than 255, assuming that
they all have the same width as the character whose code is $c \bmod 256$.
@^oriental characters@>@^Chinese characters@>@^Japanese characters@>
\yskip\hang|set3| 130 |c[3]|. Same as |set1|, except that |c|~is three
bytes long, so it can be as large as $2^{24}-1$.
\yskip\hang|set4| 131 |c[4]|. Same as |set1|, except that |c|~is four
bytes long, possibly even negative. Imagine that.
\yskip\hang|set_rule| 132 |a[4]| |b[4]|. Typeset a solid black rectangle
of height |a| and width |b|, with its bottom left corner at |(h,v)|. Then
set |h:=h+b|. If either |a<=0| or |b<=0|, nothing should be typeset. Note
that if |b<0|, the value of |h| will decrease even though nothing else happens.
Programs that typeset from \.{DVI} files should be careful to make the rules
line up carefully with digitized characters, as explained in connection with
the |rule_pixels| subroutine below.
\yskip\hang|put1| 133 |c[1]|. Typeset character number~|c| from font~|f|
such that the reference point of the character is at |(h,v)|. (The `put'
commands are exactly like the `set' commands, except that they simply put out a
character or a rule without moving the reference point afterwards.)
\yskip\hang|put2| 134 |c[2]|. Same as |set2|, except that |h| is not changed.
\yskip\hang|put3| 135 |c[3]|. Same as |set3|, except that |h| is not changed.
\yskip\hang|put4| 136 |c[4]|. Same as |set4|, except that |h| is not changed.
\yskip\hang|put_rule| 137 |a[4]| |b[4]|. Same as |set_rule|, except that
|h| is not changed.
\yskip\hang|nop| 138. No operation, do nothing. Any number of |nop|'s
may occur between \.{DVI} commands, but a |nop| cannot be inserted between
a command and its parameters or between two parameters.
\yskip\hang|bop| 139 $c_0[4]$ $c_1[4]$ $\ldots$ $c_9[4]$ $p[4]$. Beginning
of a page: Set |(h,v,w,x,y,z):=(0,0,0,0,0,0)| and set the stack empty. Set
the current font |f| to an undefined value. The ten $c_i$ parameters can
be used to identify pages, if a user wants to print only part of a \.{DVI}
file; \TeX82 gives them the values of \.{\\count0} $\ldots$ \.{\\count9}
at the time \.{\\shipout} was invoked for this page. The parameter |p|
points to the previous |bop| command in the file, where the first |bop|
has $p=-1$.
\yskip\hang|eop| 140. End of page: Print what you have read since the
previous |bop|. At this point the stack should be empty. (The \.{DVI}-reading
programs that drive most output devices will have kept a buffer of the
material that appears on the page that has just ended. This material is
largely, but not entirely, in order by |v| coordinate and (for fixed |v|) by
|h|~coordinate; so it usually needs to be sorted into some order that is
appropriate for the device in question. \.{\title} does not do such sorting.)
\yskip\hang|push| 141. Push the current values of |(h,v,w,x,y,z)| onto the
top of the stack; do not change any of these values. Note that |f| is
not pushed.
\yskip\hang|pop| 142. Pop the top six values off of the stack and assign
them to |(h,v,w,x,y,z)|. The number of pops should never exceed the number
of pushes, since it would be highly embarrassing if the stack were empty
at the time of a |pop| command.
\yskip\hang|right1| 143 |b[1]|. Set |h:=h+b|, i.e., move right |b| units.
The parameter is a signed number in two's complement notation, |-128<=b<128|;
if |b<0|, the reference point actually moves left.
\yskip\hang|right2| 144 |b[2]|. Same as |right1|, except that |b| is a
two-byte quantity in the range |-32768<=b<32768|.
\yskip\hang|right3| 145 |b[3]|. Same as |right1|, except that |b| is a
three-byte quantity in the range |@t$-2^{23}$@><=b<@t$2^{23}$@>|.
\yskip\hang|right4| 146 |b[4]|. Same as |right1|, except that |b| is a
four-byte quantity in the range |@t$-2^{31}$@><=b<@t$2^{31}$@>|.
\yskip\hang|w0| 147. Set |h:=h+w|; i.e., move right |w| units. With luck,
this parameterless command will usually suffice, because the same kind of motion
will occur several times in succession; the following commands explain how
|w| gets particular values.
\yskip\hang|w1| 148 |b[1]|. Set |w:=b| and |h:=h+b|. The value of |b| is a
signed quantity in two's complement notation, |-128<=b<128|. This command
changes the current |w|~spacing and moves right by |b|.
\yskip\hang|w2| 149 |b[2]|. Same as |w1|, but |b| is a two-byte-long
parameter, |-32768<=b<32768|.
\yskip\hang|w3| 150 |b[3]|. Same as |w1|, but |b| is a three-byte-long
parameter, |@t$-2^{23}$@><=b<@t$2^{23}$@>|.
\yskip\hang|w4| 151 |b[4]|. Same as |w1|, but |b| is a four-byte-long
parameter, |@t$-2^{31}$@><=b<@t$2^{31}$@>|.
\yskip\hang|x0| 152. Set |h:=h+x|; i.e., move right |x| units. The `|x|'
commands are like the `|w|' commands except that they involve |x| instead
of |w|.
\yskip\hang|x1| 153 |b[1]|. Set |x:=b| and |h:=h+b|. The value of |b| is a
signed quantity in two's complement notation, |-128<=b<128|. This command
changes the current |x|~spacing and moves right by |b|.
\yskip\hang|x2| 154 |b[2]|. Same as |x1|, but |b| is a two-byte-long
parameter, |-32768<=b<32768|.
\yskip\hang|x3| 155 |b[3]|. Same as |x1|, but |b| is a three-byte-long
parameter, |@t$-2^{23}$@><=b<@t$2^{23}$@>|.
\yskip\hang|x4| 156 |b[4]|. Same as |x1|, but |b| is a four-byte-long
parameter, |@t$-2^{31}$@><=b<@t$2^{31}$@>|.
\yskip\hang|down1| 157 |a[1]|. Set |v:=v+a|, i.e., move down |a| units.
The parameter is a signed number in two's complement notation, |-128<=a<128|;
if |a<0|, the reference point actually moves up.
\yskip\hang|down2| 158 |a[2]|. Same as |down1|, except that |a| is a
two-byte quantity in the range |-32768<=a<32768|.
\yskip\hang|down3| 159 |a[3]|. Same as |down1|, except that |a| is a
three-byte quantity in the range |@t$-2^{23}$@><=a<@t$2^{23}$@>|.
\yskip\hang|down4| 160 |a[4]|. Same as |down1|, except that |a| is a
four-byte quantity in the range |@t$-2^{31}$@><=a<@t$2^{31}$@>|.
\yskip\hang|y0| 161. Set |v:=v+y|; i.e., move down |y| units. With luck,
this parameterless command will usually suffice, because the same kind of motion
will occur several times in succession; the following commands explain how
|y| gets particular values.
\yskip\hang|y1| 162 |a[1]|. Set |y:=a| and |v:=v+a|. The value of |a| is a
signed quantity in two's complement notation, |-128<=a<128|. This command
changes the current |y|~spacing and moves down by |a|.
\yskip\hang|y2| 163 |a[2]|. Same as |y1|, but |a| is a two-byte-long
parameter, |-32768<=a<32768|.
\yskip\hang|y3| 164 |a[3]|. Same as |y1|, but |a| is a three-byte-long
parameter, |@t$-2^{23}$@><=a<@t$2^{23}$@>|.
\yskip\hang|y4| 165 |a[4]|. Same as |y1|, but |a| is a four-byte-long
parameter, |@t$-2^{31}$@><=a<@t$2^{31}$@>|.
\yskip\hang|z0| 166. Set |v:=v+z|; i.e., move down |z| units. The `|z|' commands
are like the `|y|' commands except that they involve |z| instead of |y|.
\yskip\hang|z1| 167 |a[1]|. Set |z:=a| and |v:=v+a|. The value of |a| is a
signed quantity in two's complement notation, |-128<=a<128|. This command
changes the current |z|~spacing and moves down by |a|.
\yskip\hang|z2| 168 |a[2]|. Same as |z1|, but |a| is a two-byte-long
parameter, |-32768<=a<32768|.
\yskip\hang|z3| 169 |a[3]|. Same as |z1|, but |a| is a three-byte-long
parameter, |@t$-2^{23}$@><=a<@t$2^{23}$@>|.
\yskip\hang|z4| 170 |a[4]|. Same as |z1|, but |a| is a four-byte-long
parameter, |@t$-2^{31}$@><=a<@t$2^{31}$@>|.
\yskip\hang|fnt_num_0| 171. Set |f:=0|. Font 0 must previously have been
defined by a \\{fnt\_def} instruction, as explained below.
\yskip\hang|fnt_num_1| through |fnt_num_63| (opcodes 172 to 234). Set
|f:=1|, \dots, |f:=63|, respectively.
\yskip\hang|fnt1| 235 |k[1]|. Set |f:=k|. \TeX82 uses this command for font
numbers in the range |64<=k<256|.
\yskip\hang|fnt2| 236 |k[2]|. Same as |fnt1|, except that |k|~is two
bytes long, so it is in the range |0<=k<65536|. \TeX82 never generates this
command, but large font numbers may prove useful for specifications of
color or texture, or they may be used for special fonts that have fixed
numbers in some external coding scheme.
\yskip\hang|fnt3| 237 |k[3]|. Same as |fnt1|, except that |k|~is three
bytes long, so it can be as large as $2^{24}-1$.
\yskip\hang|fnt4| 238 |k[4]|. Same as |fnt1|, except that |k|~is four
bytes long; this is for the really big font numbers (and for the negative ones).
\yskip\hang|xxx1| 239 |k[1]| |x[k]|. This command is undefined in
general; it functions as a $(k+2)$-byte |nop| unless special \.{DVI}-reading
programs are being used. \TeX82 generates |xxx1| when a short enough
\.{\\special} appears, setting |k| to the number of bytes being sent. It
is recommended that |x| be a string having the form of a keyword followed
by possible parameters relevant to that keyword.
\yskip\hang|xxx2| 240 |k[2]| |x[k]|. Like |xxx1|, but |0<=k<65536|.
\yskip\hang|xxx3| 241 |k[3]| |x[k]|. Like |xxx1|, but |0<=k<@t$2^{24}$@>|.
\yskip\hang|xxx4| 242 |k[4]| |x[k]|. Like |xxx1|, but |k| can be ridiculously
large. \TeX82 uses |xxx4| when |xxx1| would be incorrect.
\yskip\hang|fnt_def1| 243 |k[1]| |c[4]| |s[4]| |d[4]| |a[1]| |l[1]| |n[a+l]|.
Define font |k|, where |0<=k<256|; font definitions will be explained shortly.
\yskip\hang|fnt_def2| 244 |k[2]| |c[4]| |s[4]| |d[4]| |a[1]| |l[1]| |n[a+l]|.
Define font |k|, where |0<=k<65536|.
\yskip\hang|fnt_def3| 245 |k[3]| |c[4]| |s[4]| |d[4]| |a[1]| |l[1]| |n[a+l]|.
Define font |k|, where |0<=k<@t$2^{24}$@>|.
\yskip\hang|fnt_def4| 246 |k[4]| |c[4]| |s[4]| |d[4]| |a[1]| |l[1]| |n[a+l]|.
Define font |k|, where |@t$-2^{31}$@><=k<@t$2^{31}$@>|.
\yskip\hang|pre| 247 |i[1]| |num[4]| |den[4]| |mag[4]| |k[1]| |x[k]|.
Beginning of the preamble; this must come at the very beginning of the
file. Parameters |i|, |num|, |den|, |mag|, |k|, and |x| are explained below.
\yskip\hang|post| 248. Beginning of the postamble, see below.
\yskip\hang|post_post| 249. Ending of the postamble, see below.
\yskip\noindent Commands 250--255 are undefined at the present time.
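The effect of the stack and movement commands (opcodes 141 to 170) on the six registers can be sketched in a few lines of Python. This is only an illustration under simplifying assumptions, not the way \.{\title} is organized: the names \.{signed} and \.{run\_moves} are hypothetical, and every command outside that opcode range is rejected.

```python
def signed(data: bytes) -> int:
    """Interpret big-endian bytes as a two's complement integer."""
    return int.from_bytes(data, "big", signed=True)


def run_moves(dvi: bytes):
    """Apply push, pop and the right/w/x/down/y/z commands.

    Returns the final registers (h, v, w, x, y, z).
    """
    h = v = w = x = y = z = 0
    stack = []
    i = 0
    while i < len(dvi):
        op = dvi[i]
        i += 1
        if op == 141:                    # push (the font f is not saved)
            stack.append((h, v, w, x, y, z))
        elif op == 142:                  # pop
            h, v, w, x, y, z = stack.pop()
        elif 143 <= op <= 146:           # right1..right4
            n = op - 142
            h += signed(dvi[i:i + n])
            i += n
        elif 147 <= op <= 151:           # w0..w4
            n = op - 147
            if n:
                w = signed(dvi[i:i + n])
                i += n
            h += w
        elif 152 <= op <= 156:           # x0..x4
            n = op - 152
            if n:
                x = signed(dvi[i:i + n])
                i += n
            h += x
        elif 157 <= op <= 160:           # down1..down4
            n = op - 156
            v += signed(dvi[i:i + n])
            i += n
        elif 161 <= op <= 165:           # y0..y4
            n = op - 161
            if n:
                y = signed(dvi[i:i + n])
                i += n
            v += y
        elif 166 <= op <= 170:           # z0..z4
            n = op - 166
            if n:
                z = signed(dvi[i:i + n])
                i += n
            v += z
        else:
            raise ValueError(f"opcode {op} is outside this sketch")
    return h, v, w, x, y, z
```

For instance, the bytes 143 5 148 254 147 157 10 mean |right1| 5, |w1| $-2$, |w0|, |down1| 10, so the sketch ends with |h=1|, |v=10|, and |w=-2|.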
@ @d set_char_0=0 {typeset character 0 and move right}
@d set1=128 {typeset a character and move right}
@d set_rule=132 {typeset a rule and move right}
@d put1=133 {typeset a character}
@d put_rule=137 {typeset a rule}
@d nop=138 {no operation}
@d bop=139 {beginning of page}
@d eop=140 {ending of page}
@d push=141 {save the current positions}
@d pop=142 {restore previous positions}
@d right1=143 {move right}
@d w0=147 {move right by |w|}
@d w1=148 {move right and set |w|}
@d x0=152 {move right by |x|}
@d x1=153 {move right and set |x|}
@d down1=157 {move down}
@d y0=161 {move down by |y|}
@d y1=162 {move down and set |y|}
@d z0=166 {move down by |z|}
@d z1=167 {move down and set |z|}
@d fnt_num_0=171 {set current font to 0}
@d fnt1=235 {set current font}
@d xxx1=239 {extension to \.{DVI} primitives}
@d xxx4=242 {potentially long extension to \.{DVI} primitives}
@d fnt_def1=243 {define the meaning of a font number}
@d pre=247 {preamble}
@d post=248 {postamble beginning}
@d post_post=249 {postamble ending}
@d undefined_commands==250,251,252,253,254,255
@ The preamble contains basic information about the file as a whole. As
stated above, there are six parameters:
$$\hbox{|@!i[1]| |@!num[4]| |@!den[4]| |@!mag[4]| |@!k[1]| |@!x[k]|.}$$
The |i| byte identifies \.{DVI} format; currently this byte is always set
to~2. (The value |i=3| is currently used for an extended format that
allows a mixture of right-to-left and left-to-right typesetting.
Some day we will set |i=4|, when \.{DVI} format makes another
incompatible change---perhaps in the year 2048.)
The next two parameters, |num| and |den|, are positive integers that define
the units of measurement; they are the numerator and denominator of a
fraction by which all dimensions in the \.{DVI} file could be multiplied
in order to get lengths in units of $10^{-7}$ meters. (For example, there are
exactly 7227 \TeX\ points in 254 centimeters, and \TeX82 works with scaled
points where there are $2^{16}$ sp in a point, so \TeX82 sets |num=25400000|
and $|den|=7227\cdot2^{16}=473628672$.)
@^sp@>
The |mag| parameter is what \TeX82 calls \.{\\mag}, i.e., 1000 times the
desired magnification. The actual fraction by which dimensions are
multiplied is therefore $|mag|\cdot|num|/(1000\cdot|den|)$. Note that if a \TeX\ source document
does not call for any `\.{true}' dimensions, and if you change it only by
specifying a different \.{\\mag} setting, the \.{DVI} file that \TeX\
creates will be completely unchanged except for the value of |mag| in the
preamble and postamble. (Fancy \.{DVI}-reading programs allow users to
override the |mag|~setting when a \.{DVI} file is being printed.)
Finally, |k| and |x| allow the \.{DVI} writer to include a comment, which is not
interpreted further. The length of comment |x| is |k|, where |0<=k<256|.
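As a rough illustration of the preamble layout, the following Python sketch unpacks the six parameters and computes the length of one \.{DVI} unit in units of $10^{-7}$ meters. The function names \.{parse\_preamble} and \.{units\_per\_dvi\_unit} are hypothetical, and the sketch assumes the whole file is already in memory.

```python
import struct
from fractions import Fraction


def parse_preamble(dvi: bytes):
    """Return (i, num, den, mag, comment) from the start of a DVI file."""
    if dvi[0] != 247:
        raise ValueError("file does not start with pre")
    # i[1] num[4] den[4] mag[4] k[1], all big-endian, 14 bytes in total
    i, num, den, mag, k = struct.unpack_from(">BiiiB", dvi, 1)
    comment = dvi[15:15 + k]
    return i, num, den, mag, comment


def units_per_dvi_unit(num: int, den: int, mag: int) -> Fraction:
    """Length of one DVI unit in units of 10^-7 meters, magnification included."""
    return Fraction(mag * num, 1000 * den)
```

With the values that \TeX82 writes and |mag=1000|, one point ($2^{16}$ sp) comes out as $25400000/7227$ units, i.e., about 0.35146 millimeters.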
@d dvi_id=2 {identifies the kind of \.{DVI} files described here}
@ Font definitions for a given font number |k| contain further parameters
$$\hbox{|c[4]| |s[4]| |d[4]| |a[1]| |l[1]| |n[a+l]|.}$$
The four-byte value |c| is the check sum that \TeX\ (or whatever program
generated the \.{DVI} file) found in the \.{TFM} file for this font;
|c| should match the check sum of the font found by programs that read
this \.{DVI} file.
@^check sum@>
Parameter |s| contains a fixed-point scale factor that is applied to the
character widths in font |k|; font dimensions in \.{TFM} files and other
font files are relative to this quantity, which is always positive and
less than $2^{27}$. It is given in the same units as the other dimensions
of the \.{DVI} file. Parameter |d| is similar to |s|; it is the ``design
size,'' and (like~|s|) it is given in \.{DVI} units. Thus, font |k| is to be
used at $|mag|\cdot s/1000d$ times its normal size.
The remaining part of a font definition gives the external name of the font,
which is an ASCII string of length |a+l|. The number |a| is the length
of the ``area'' or directory, and |l| is the length of the font name itself;
the standard local system font area is supposed to be used when |a=0|.
The |n| field contains the area in its first |a| bytes.
Font definitions must appear before the first use of a particular font number.
Once font |k| is defined, it must not be defined again; however, we
shall see below that font definitions appear in the postamble as well as
in the pages, so in this sense each font number is defined exactly twice,
if at all. Like |nop| commands and \\{xxx} commands, font definitions can
appear before the first |bop|, or between an |eop| and a |bop|.
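A font definition can be decoded along the following lines. This Python sketch is an assumption-laden illustration (the names \.{parse\_fnt\_def} and \.{relative\_size} are hypothetical); it handles all four \\{fnt\_def} commands and computes the factor $|mag|\cdot s/1000d$ as an exact fraction.

```python
import struct
from fractions import Fraction


def parse_fnt_def(buf: bytes, pos: int):
    """Decode the fnt_def command at buf[pos]; return (font, next_pos)."""
    op = buf[pos]
    if not 243 <= op <= 246:
        raise ValueError("not a fnt_def command")
    n = op - 242                       # the font number k occupies 1 to 4 bytes
    k = int.from_bytes(buf[pos + 1:pos + 1 + n], "big", signed=(n == 4))
    pos += 1 + n
    # c[4] s[4] d[4] a[1] l[1], big-endian, 14 bytes in total
    c, s, d, a, l = struct.unpack_from(">IiiBB", buf, pos)
    pos += 14
    name = buf[pos:pos + a + l]
    font = {"k": k, "c": c, "s": s, "d": d,
            "area": name[:a], "name": name[a:]}
    return font, pos + a + l


def relative_size(font: dict, mag: int) -> Fraction:
    """Factor mag*s/(1000*d) by which the font is enlarged."""
    return Fraction(mag * font["s"], 1000 * font["d"])
```

For example, a font with design size 10 points that is loaded at 12 points has $s=12\cdot2^{16}$ and $d=10\cdot2^{16}$, so with |mag=1000| it is used at $6/5$ of its normal size.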
@ The last page in a \.{DVI} file is followed by `|post|'; this command
introduces the postamble, which summarizes important facts that \TeX\ has
accumulated about the file, making it possible to print subsets of the data
with reasonable efficiency. The postamble has the form
$$\vbox{\halign{\hbox{#\hfil}\cr
|post| |p[4]| |num[4]| |den[4]| |mag[4]| |l[4]| |u[4]| |s[2]| |t[2]|\cr
$\langle\,$font definitions$\,\rangle$\cr
|post_post| |q[4]| |i[1]| 223's$[{\G}4]$\cr}}$$
Here |p| is a pointer to the final |bop| in the file. The next three
parameters, |num|, |den|, and |mag|, are duplicates of the quantities that
appeared in the preamble.
Parameters |l| and |u| give respectively the height-plus-depth of the tallest
page and the width of the widest page, in the same units as other dimensions
of the file. These numbers might be used by a \.{DVI}-reading program to
position individual ``pages'' on large sheets of film or paper; however,
the standard convention for output on normal size paper is to position each
page so that the upper left-hand corner is exactly one inch from the left
and the top. Experience has shown that it is unwise to design \.{DVI}-to-printer
software that attempts cleverly to center the output; a fixed position of
the upper left corner is easiest for users to understand and to work with.
Therefore |l| and~|u| are often ignored.
Parameter |s| is the maximum stack depth (i.e., the largest excess of
|push| commands over |pop| commands) needed to process this file. Then
comes |t|, the total number of pages (|bop| commands) present.
The postamble continues with font definitions, which are any number of
\\{fnt\_def} commands as described above, possibly interspersed with |nop|
commands. Each font number that is used in the \.{DVI} file must be defined
exactly twice: Once before it is first selected by a \\{fnt} command, and once
in the postamble.
@ The last part of the postamble, following the |post_post| byte that
signifies the end of the font definitions, contains |q|, a pointer to the
|post| command that started the postamble. An identification byte, |i|,
comes next; this currently equals~2, as in the preamble.
The |i| byte is followed by four or more bytes that are all equal to
the decimal number 223 (i.e., @'337 in octal). \TeX\ puts out four to seven of
these trailing bytes, until the total length of the file is a multiple of
four bytes, since this works out best on machines that pack four bytes per
word; but any number of 223's is allowed, as long as there are at least four
of them. In effect, 223 is a sort of signature that is added at the very end.
@^Fuchs, David Raymond@>
This curious way to finish off a \.{DVI} file makes it feasible for
\.{DVI}-reading programs to find the postamble first, on most computers,
even though \TeX\ wants to write the postamble last. Most operating
systems permit random access to individual words or bytes of a file, so
the \.{DVI} reader can start at the end and skip backwards over the 223's
until finding the identification byte. Then it can back up four bytes, read
|q|, and move to byte |q| of the file. This byte should, of course,
contain the value 248 (|post|); now the postamble can be read, so the
\.{DVI} reader discovers all the information needed for typesetting the
pages. Note that it is also possible to skip through the \.{DVI} file at
reasonably high speed to locate a particular page, if that proves
desirable. This saves a lot of time, since \.{DVI} files used in production
jobs tend to be large.
Unfortunately, however, standard \PASCAL\ does not include the ability to
@^system dependencies@>
access a random position in a file, or even to determine the length of a file.
Almost all systems nowadays provide the necessary capabilities, so \.{DVI}
format has been designed to work most efficiently with modern operating systems.
As noted above, \.{\title} will limit itself to the restrictions of standard
\PASCAL\ if |random_reading| is defined to be |false|.
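The backward search for the postamble can be sketched as follows in Python, assuming that the whole file has been read into memory (which sidesteps the random-access question). The function name \.{find\_postamble} is hypothetical, and the sketch performs only the checks mentioned in the text.

```python
def find_postamble(dvi: bytes) -> int:
    """Return q, the position of the post command, found from the end of the file."""
    pos = len(dvi) - 1
    count = 0
    while pos >= 0 and dvi[pos] == 223:      # skip the trailing signature bytes
        pos -= 1
        count += 1
    if count < 4:
        raise ValueError("fewer than four trailing 223's")
    if pos < 5 or dvi[pos] != 2:             # identification byte i
        raise ValueError("bad identification byte")
    q = int.from_bytes(dvi[pos - 4:pos], "big", signed=True)
    if dvi[pos - 5] != 249:                  # post_post precedes q
        raise ValueError("post_post not found")
    if not 0 <= q < len(dvi) or dvi[q] != 248:
        raise ValueError("q does not point to post")
    return q
```

A reader would then continue at byte |q|, where the |post| command and its parameters begin.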
@* Virtual fonts.
Before we get into the details of \.{\title}, we need to know exactly
what \.{VF} files are. Inspired by earlier work of David~R. Fuchs back
@^Fuchs, David Raymond@>
in 1984, the form of such files was designed by Donald~E. Knuth in 1989.
@^Knuth, Donald Ervin@>
The idea behind \.{VF} files is that a general
interface mechanism is needed to switch between the myriad font
layouts provided by different suppliers of typesetting equipment.
Without such a mechanism, people must go to great lengths writing
inscrutable macros whenever they want to use typesetting conventions
based on one font layout in connection with actual fonts that have
another layout. This puts an extra burden on the typesetting system,
interfering with the other things it needs to do (like kerning,
hyphenation, and ligature formation).
These difficulties go away when we have a ``virtual font,''
i.e., a font that exists in a logical sense but not a physical sense.
A typesetting system like \TeX\ can do its job without knowing where the
actual characters come from; a device driver can then do its job by
letting a \.{VF} file tell what actual characters correspond to the
characters \TeX\ imagined were present. The actual characters
can be shifted and/or magnified and/or combined with other characters
from many different fonts. A virtual font can even make use of characters
from virtual fonts, including itself.
Virtual fonts also allow convenient character substitutions for proofreading
purposes, when fonts designed for one output device are unavailable on another.
@ A \.{VF} file is organized as a stream of 8-bit bytes, using conventions
borrowed from \.{DVI} and \.{PK} files. Thus, a device driver that knows
about \.{DVI} and \.{PK} format will already
contain most of the mechanisms necessary to process \.{VF} files.
A preamble
appears at the beginning, followed by a sequence of character definitions,
followed by a postamble. More precisely, the first byte of every \.{VF} file
must be the first byte of the following ``preamble command'':
\yskip\hang|pre| 247 |i[1]| |k[1]| |x[k]| |cs[4]| |ds[4]|.
Here |i| is the identification byte of \.{VF}, currently 202. The string
|x| is merely a comment, usually indicating the source of the \.{VF} file.
Parameters |cs| and |ds| are respectively the check sum and the design size
of the virtual font; they should match the first two words in the header of
the \.{TFM} file, as described below.
\yskip
After the |pre| command, the preamble continues with font definitions;
every font needed to specify ``actual'' characters in later
\\{set\_char} commands is defined here. The font definitions are
exactly the same in \.{VF} files as they are in \.{DVI} files, except
that the scaled size |s| is relative and the design size |d| is absolute:
\yskip\hang|fnt_def1| 243 |k[1]| |c[4]| |s[4]| |d[4]| |a[1]| |l[1]| |n[a+l]|.
Define font |k|, where |0<=k<256|.
\yskip\hang|@!fnt_def2| 244 |k[2]| |c[4]| |s[4]| |d[4]| |a[1]| |l[1]| |n[a+l]|.
Define font |k|, where |0<=k<65536|.
\yskip\hang|@!fnt_def3| 245 |k[3]| |c[4]| |s[4]| |d[4]| |a[1]| |l[1]| |n[a+l]|.
Define font |k|, where |0<=k<@t$2^{24}$@>|.
\yskip\hang|@!fnt_def4| 246 |k[4]| |c[4]| |s[4]| |d[4]| |a[1]| |l[1]| |n[a+l]|.
Define font |k|, where |@t$-2^{31}$@><=k<@t$2^{31}$@>|.
\yskip\noindent
These font numbers |k| are ``local''; they have no relation to font numbers
defined in the \.{DVI} file that uses this virtual font. The dimension~|s|,
which represents the scaled size of the local font being defined,
is a |fix_word| relative to the design size of the virtual font.
Thus if the local font is to be used at the same size
as the design size of the virtual font itself, |s| will be the
integer value $2^{20}$. The value of |s| must be positive and less than
$2^{24}$ (thus less than 16 when considered as a |fix_word|).
The dimension~|d| is a |fix_word| in units of printer's points; hence it
is identical to the design size found in the corresponding \.{TFM} file.
@d vf_id=202
@ The preamble is followed by zero or more character packets, where each
character packet begins with a byte that is $<243$. Character packets have
two formats, one long and one short:
\yskip\hang|long_char| 242 |pl[4]| |cc[4]| |tfm[4]| |dvi[pl]|. This long form
specifies a virtual character in the general case.
\yskip\hang|short_char0..short_char241|
|pl[1]| |cc[1]| |tfm[3]| |dvi[pl]|. This short form specifies a
virtual character in the common case
when |0<=pl<242| and |0<=cc<256| and $0\le|tfm|<2^{24}$.
\yskip\noindent
Here |pl| denotes the packet length following the |tfm| value; |cc| is
the character code; and |tfm| is the character width copied from the
\.{TFM} file for this virtual font. There should be at most one character
packet having any given |cc| code.
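The two packet formats can be told apart by their first byte. The Python sketch below is an illustration only; the name \.{read\_char\_packet} is hypothetical, and it rests on the reading that in the short form the first byte itself is the packet length |pl|.

```python
def read_char_packet(vf: bytes, pos: int):
    """Decode one character packet at vf[pos]; return (cc, tfm, dvi, next_pos)."""
    first = vf[pos]
    if first > 242:
        raise ValueError("not a character packet")
    if first == 242:                                   # long_char
        pl = int.from_bytes(vf[pos + 1:pos + 5], "big")
        cc = int.from_bytes(vf[pos + 5:pos + 9], "big", signed=True)
        tfm = int.from_bytes(vf[pos + 9:pos + 13], "big", signed=True)
        start = pos + 13
    else:                                              # short form, first byte is pl
        pl = first
        cc = vf[pos + 1]
        tfm = int.from_bytes(vf[pos + 2:pos + 5], "big")
        start = pos + 5
    return cc, tfm, vf[start:start + pl], start + pl
```

The returned position is that of the byte after the packet, so the function can be called repeatedly until a byte |>=243| is met.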
The |dvi| bytes are a sequence of complete \.{DVI} commands, properly
nested with respect to |push| and |pop|. All \.{DVI} operations are
permitted except |bop|, |eop|, and commands with opcodes |>=243|.
Font selection commands (|fnt_num_0| through |fnt4|) must refer to fonts
defined in the preamble.
Dimensions that appear in the \.{DVI} instructions are analogous to
|fix_word| quantities; i.e., they are integer multiples of $2^{-20}$ times
the design size of the virtual font. For example, if the virtual font
has design size $10\,$pt, the \.{DVI} command to move down $5\,$pt
would be a \\{down} instruction with parameter $2^{19}$. The virtual font
itself might be used at a different size, say $12\,$pt; then that
\\{down} instruction would move down $6\,$pt instead. Each dimension
must be less than $2^{24}$ in absolute value.
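The arithmetic of this example can be checked with a short Python sketch. It uses exact rational numbers for clarity, whereas a real driver would presumably use integer arithmetic with explicit rounding; the name \.{scale\_vf\_dimen} is hypothetical.

```python
from fractions import Fraction


def scale_vf_dimen(param: int, size) -> Fraction:
    """Scale a VF dimension, a multiple of 2^-20 of the font size, to real units.

    `size` is the size at which the virtual font is used, in any unit;
    the result is in the same unit.
    """
    if abs(param) >= 2**24:
        raise ValueError("dimension out of range")
    return Fraction(param, 2**20) * size
```

With parameter $2^{19}$ the result is 5 when the size is 10 and 6 when the size is 12, in agreement with the example.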
Device drivers processing \.{VF} files treat the sequences of |dvi| bytes
as subroutines or macros, implicitly enclosing them with |push| and |pop|.
Each subroutine begins with |w=x=y=z=0|, and with current font~|f| the
number of the first font defined in the preamble (undefined if there's no
such font). After the |dvi| commands have been
performed, the |h| and~|v| position registers of \.{DVI} format and the
current font~|f| are restored to their former values;
then, if the subroutine has been invoked by a \\{set\_char} or \\{set}
command, |h|~is increased by the \.{TFM} width
(properly scaled)---just as if a simple character had been typeset.
@d long_char=242 {\.{VF} command for general character packet}
@d improper_DVI_for_VF==139,140,243,244,245,246,247,248,249,250,251,252,
253,254,255
@ The character packets are followed by a trivial postamble, consisting of
one or more bytes all equal to |post| (248). The total number of bytes
in the file should be a multiple of~4.
@* Font metric data.
The idea behind \.{TFM} files is that typesetting routines like \TeX\
need a compact way to store the relevant information about several
dozen fonts, and computer centers need a compact way to store the
relevant information about several hundred fonts. \.{TFM} files are
compact, and most of the information they contain is highly relevant,
so they provide a solution to the problem.
The information in a \.{TFM} file appears in a sequence of 8-bit bytes.
Since the number of bytes is always a multiple of 4, we could
also regard the file as a sequence of 32-bit words; but \TeX\ uses the
byte interpretation, and so does \.{\title}. Note that the bytes
are considered to be unsigned numbers.
@ The first 24 bytes (6 words) of a \.{TFM} file contain twelve 16-bit
integers that give the lengths of the various subsequent portions
of the file. These twelve integers are, in order:
$$\vbox{\halign{\hfil#&$\null=\null$#\hfil\cr
|@!lf|&length of the entire file, in words;\cr
|@!lh|&length of the header data, in words;\cr
|@!bc|&smallest character code in the font;\cr
|@!ec|&largest character code in the font;\cr
|@!nw|&number of words in the width table;\cr
|@!nh|&number of words in the height table;\cr
|@!nd|&number of words in the depth table;\cr
|@!ni|&number of words in the italic correction table;\cr
|@!nl|&number of words in the lig/kern table;\cr
|@!nk|&number of words in the kern table;\cr
|@!ne|&number of words in the extensible character table;\cr
|@!np|&number of font parameter words.\cr}}$$
They are all nonnegative and less than $2^{15}$. We must have |bc-1<=ec<=255|,
|ne<=256|, and
$$\hbox{|lf=6+lh+(ec-bc+1)+nw+nh+nd+ni+nl+nk+ne+np|.}$$
Note that a font may contain as many as 256 characters (if |bc=0| and |ec=255|),
and as few as 0 characters (if |bc=ec+1|).
Incidentally, when two or more 8-bit bytes are combined to form an integer of
16 or more bits, the most significant bytes appear first in the file.
This is called BigEndian order.
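The twelve lengths and the consistency conditions on them can be read as in the following Python sketch. The name \.{read\_tfm\_lengths} is hypothetical, and the sketch checks only the conditions stated here.

```python
import struct

NAMES = ("lf", "lh", "bc", "ec", "nw", "nh", "nd", "ni", "nl", "nk", "ne", "np")


def read_tfm_lengths(tfm: bytes) -> dict:
    """Unpack the twelve 16-bit big-endian lengths and check their consistency."""
    values = struct.unpack_from(">12H", tfm, 0)
    t = dict(zip(NAMES, values))
    if any(v >= 2**15 for v in values):
        raise ValueError("a length is 2^15 or more")
    if not (t["bc"] - 1 <= t["ec"] <= 255) or t["ne"] > 256:
        raise ValueError("bc, ec or ne out of range")
    expected = (6 + t["lh"] + (t["ec"] - t["bc"] + 1)
                + sum(t[name] for name in NAMES[4:]))   # nw through np
    if t["lf"] != expected:
        raise ValueError("lf does not match the sum of the parts")
    return t
```

A font with |lh=2|, two characters, two widths, and one entry in each of the height, depth, and italic tables must therefore have $|lf|=6+2+2+2+1+1+1=15$.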
@ The rest of the \.{TFM} file may be regarded as a sequence of ten data
arrays having the informal specification
$$\def\arr$[#1]#2${\&{array} $[#1]$ \&{of} #2}
\vbox{\halign{\hfil\\{#}&$\,:\,$\arr#\hfil\cr
header&|[0..lh-1]stuff|\cr
char\_info&|[bc..ec]char_info_word|\cr
width&|[0..nw-1]fix_word|\cr
height&|[0..nh-1]fix_word|\cr
depth&|[0..nd-1]fix_word|\cr
italic&|[0..ni-1]fix_word|\cr
lig\_kern&|[0..nl-1]lig_kern_command|\cr
kern&|[0..nk-1]fix_word|\cr
exten&|[0..ne-1]extensible_recipe|\cr
param&|[1..np]fix_word|\cr}}$$
The most important data type used here is a |@!fix_word|, which is
a 32-bit representation of a binary fraction. A |fix_word| is a signed
quantity, with the two's complement of the entire word used to represent
negation. Of the 32 bits in a |fix_word|, exactly 12 are to the left of the
binary point; thus, the largest |fix_word| value is $2048-2^{-20}$, and
the smallest is $-2048$. We will see below, however, that all but two of
the |fix_word| values will lie between $-16$ and $+16$.
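The |fix_word| representation is easy to check numerically. In the Python sketch below the function name \.{fix\_word} is merely borrowed from the text for illustration; it converts four big-endian bytes into an exact fraction.

```python
from fractions import Fraction


def fix_word(data: bytes) -> Fraction:
    """Convert a 4-byte big-endian two's complement fix_word to a fraction."""
    if len(data) != 4:
        raise ValueError("a fix_word has exactly four bytes")
    return Fraction(int.from_bytes(data, "big", signed=True), 2**20)
```

The extreme bit patterns give $2048-2^{-20}$ and $-2048$, and the integer $2^{20}$ represents 1.0.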
@ The first data array is a block of header information, which contains
general facts about the font. The header must contain at least two words,
and for \.{TFM} files to be used with Xerox printing software it must
contain at least 18 words, allocated as described below. When different
kinds of devices need to be interfaced, it may be necessary to add further
words to the header block.
\yskip\hang|header[0]| is a 32-bit check sum that \TeX\ will copy into the
\.{DVI} output file whenever it uses the font. Later on when the \.{DVI}
file is printed, possibly on another computer, the actual font that gets
used is supposed to have a check sum that agrees with the one in the
\.{TFM} file used by \TeX. In this way, users will be warned about
potential incompatibilities. (However, if the check sum is zero in either
the font file or the \.{TFM} file, no check is made.) The actual relation
between this check sum and the rest of the \.{TFM} file is not important;
the check sum is simply an identification number with the property that
incompatible fonts almost always have distinct check sums.
@^check sum@>
\yskip\hang|header[1]| is a |fix_word| containing the design size of the
font, in units of \TeX\ points (7227 \TeX\ points = 254 cm). This number
must be at least 1.0; it is fairly arbitrary, but usually the design size
is 10.0 for a ``10 point'' font, i.e., a font that was designed to look
best at a 10-point size, whatever that really means. When a \TeX\ user
asks for a font `\.{at} $\delta$ \.{pt}', the effect is to override the
design size and replace it by $\delta$, and to multiply the $x$ and~$y$
coordinates of the points in the font image by a factor of $\delta$
divided by the design size. {\sl All other dimensions in the\/\ \.{TFM}
file are |fix_word|\kern-1pt\ numbers in design-size units.} Thus, for example,
the value of |param[6]|, one \.{em} or \.{\\quad}, is often the |fix_word|
value $2^{20}=1.0$, since many fonts have a design size equal to one em.
The other dimensions must be less than 16 design-size units in absolute
value; thus, |header[1]| and |param[1]| are the only |fix_word| entries in
the whole \.{TFM} file whose first byte might be something besides 0 or
255. @^design size@>
\yskip\hang|header[2..11]|, if present, contains 40 bytes that identify
the character coding scheme. The first byte, which must be between 0 and
39, is the number of subsequent ASCII bytes actually relevant in this
string, which is intended to specify what character-code-to-symbol
convention is present in the font. Examples are \.{ASCII} for standard
ASCII, \.{TeX text} for fonts like \.{cmr10} and \.{cmti9}, \.{TeX math
extension} for \.{cmex10}, \.{XEROX text} for Xerox fonts, \.{GRAPHIC} for
special-purpose non-alphabetic fonts, \.{UNSPECIFIED} for the default case
when there is no information. Parentheses should not appear in this name.
(Such a string is said to be in {\mc BCPL} format.)
@^coding scheme@>
\yskip\hang|header[12..16]|, if present, contains 20 bytes that name the
font family (e.g., \.{CMR} or \.{HELVETICA}), in {\mc BCPL} format.
This field is also known as the ``font identifier.''
@^family name@>
@^font identifier@>
\yskip\hang|header[17]|, if present, contains a first byte called the
|seven_bit_safe_flag|, then two bytes that are ignored, and a fourth byte
called the |face|. If the value of the fourth byte is less than 18, it has
the following interpretation as a ``weight, slope, and expansion'': Add 0
or 2 or 4 (for medium or bold or light) to 0 or 1 (for roman or italic) to
0 or 6 or 12 (for regular or condensed or extended). For example, 13 is
0+1+12, so it represents medium italic extended. A three-letter code
(e.g., \.{MIE}) can be used for such |face| data.
\yskip\hang|header[18..@twhatever@>]| might also be present; the individual
words are simply called |header[18]|, |header[19]|, etc., at the moment.
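The |face| encoding can be turned into a three-letter code as in this Python sketch. The name \.{face\_code} is hypothetical, and the use of \.{R} for both ``roman'' and ``regular'' is an assumption suggested by the example \.{MIE}.

```python
def face_code(face: int) -> str:
    """Translate a face byte below 18 into a weight/slope/expansion code."""
    if not 0 <= face < 18:
        raise ValueError("face values of 18 or more have no such interpretation")
    weight = "MBL"[(face % 6) // 2]     # medium, bold, light
    slope = "RI"[face % 2]              # roman, italic
    expansion = "RCE"[face // 6]        # regular, condensed, extended
    return weight + slope + expansion
```

Thus 13, being $0+1+12$, yields \.{MIE}.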
@ Next comes the |char_info| array, which contains one |char_info_word|
per character. Each |char_info_word| contains six fields packed into
four bytes as follows.
\yskip\hang first byte: |width_index| (8 bits)\par
\hang second byte: |height_index| (4 bits) times 16, plus |depth_index|
(4~bits)\par
\hang third byte: |italic_index| (6 bits) times 4, plus |tag|
(2~bits)\par
\hang fourth byte: |remainder| (8 bits)\par
\yskip\noindent
The actual width of a character is |width[width_index]|, in design-size
units; this is a device for compressing information, since many characters
have the same width. Since it is quite common for many characters
to have the same height, depth, or italic correction, the \.{TFM} format
imposes a limit of 16 different heights, 16 different depths, and
64 different italic corrections.
Incidentally, the relation |width[0]=height[0]=depth[0]=italic[0]=0|
should always hold, so that an index of zero implies a value of zero.
The |width_index| should never be zero unless the character does
not exist in the font, since a character is valid if and only if it lies
between |bc| and |ec| and has a nonzero |width_index|.
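Unpacking the six fields of a |char_info_word| takes only shifts and masks, as the following Python sketch shows. The names \.{CharInfo} and \.{unpack\_char\_info} are hypothetical.

```python
from typing import NamedTuple


class CharInfo(NamedTuple):
    width_index: int
    height_index: int
    depth_index: int
    italic_index: int
    tag: int
    remainder: int


def unpack_char_info(word: bytes) -> CharInfo:
    """Split a 4-byte char_info_word into its six fields."""
    b0, b1, b2, b3 = word
    return CharInfo(
        width_index=b0,
        height_index=b1 >> 4,      # upper 4 bits of the second byte
        depth_index=b1 & 15,       # lower 4 bits of the second byte
        italic_index=b2 >> 2,      # upper 6 bits of the third byte
        tag=b2 & 3,                # lower 2 bits of the third byte
        remainder=b3,
    )
```

A character with code between |bc| and |ec| then exists exactly when the resulting \.{width\_index} is nonzero.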
@ The |tag| field in a |char_info_word| has four values that explain how to
interpret the |remainder| field.
\yskip\hang|tag=0| (|no_tag|) means that |remainder| is unused.\par
\hang|tag=1| (|lig_tag|) means that this character has a ligature/kerning
program starting at |lig_kern[remainder]|.\par
\hang|tag=2| (|list_tag|) means that this character is part of a chain of
characters of ascending sizes, and not the largest in the chain. The
|remainder| field gives the character code of the next larger character.\par
\hang|tag=3| (|ext_tag|) means that this character code represents an
extensible character, i.e., a character that is built up of smaller pieces
so that it can be made arbitrarily large. The pieces are specified in
|exten[remainder]|.\par
@d no_tag=0 {vanilla character}
@d lig_tag=1 {character has a ligature/kerning program}
@d list_tag=2 {character has a successor in a charlist}
@d ext_tag=3 {character is extensible}
@ The |lig_kern| array contains instructions in a simple programming language
that explains what to do for special letter pairs. Each word is a
|lig_kern_command| of four bytes.
\yskip\hang first byte: |skip_byte|, indicates that this is the final program
step if the byte is 128 or more, otherwise the next step is obtained by
skipping this number of intervening steps.\par
\hang second byte: |next_char|, ``if |next_char| follows the current character,
then perform the operation and stop, otherwise continue.''\par
\hang third byte: |op_byte|, indicates a ligature step if less than~128,
a kern step otherwise.\par
\hang fourth byte: |remainder|.\par
\yskip\noindent
In a kern step, an
additional space equal to |kern[256*(op_byte-128)+remainder]| is inserted
between the current character and |next_char|. This amount is
often negative, so that the characters are brought closer together
by kerning; but it might be positive.
There are eight kinds of ligature steps, having |op_byte| codes $4a+2b+c$ where
$0\le a\le b+c$ and $0\le b,c\le1$. The character whose code is
|remainder| is inserted between the current character and |next_char|;
then the current character is deleted if $b=0$, and |next_char| is
deleted if $c=0$; then we pass over $a$~characters to reach the next
current character (which may have a ligature/kerning program of its own).
Notice that if $a=0$ and $b=1$, the current character is unchanged; if
$a=b$ and $c=1$, the current character is changed but the next character is
unchanged.
If the very first instruction of the |lig_kern| array has |skip_byte=255|,
the |next_char| byte is the so-called right boundary character of this font;
the value of |next_char| need not lie between |bc| and~|ec|.
If the very last instruction of the |lig_kern| array has |skip_byte=255|,
there is a special ligature/kerning program for a left boundary character,
beginning at location |256*op_byte+remainder|.
The interpretation is that \TeX\ puts implicit boundary characters
before and after each consecutive string of characters from the same font.
These implicit characters do not appear in the output, but they can affect
ligatures and kerning.
If the very first instruction of a character's |lig_kern| program has
|skip_byte>128|, the program actually begins in location
|256*op_byte+remainder|. This feature allows access to large |lig_kern|
arrays, because the first instruction must otherwise
appear in a location |<=255|.
Any instruction with |skip_byte>128| in the |lig_kern| array must have
|256*op_byte+remainder<nl|. If such an instruction is encountered during
normal program execution, it denotes an unconditional halt; no ligature
command is performed.
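As an illustration of the rules above, here is a sketch in Python that classifies a single |lig_kern_command|. The function name `decode_lig_kern` and the result keys are assumptions; the sketch treats the command as an ordinary program step and deliberately ignores the boundary-character and indirect-address conventions for |skip_byte>128|.

```python
STOP_FLAG = 128   # skip_byte >= 128 marks the final step
KERN_FLAG = 128   # op_byte >= 128 marks a kern step

def decode_lig_kern(cmd: bytes) -> dict:
    """Describe one four-byte lig/kern step (ordinary steps only)."""
    skip_byte, next_char, op_byte, remainder = cmd
    step = {"last": skip_byte >= STOP_FLAG, "next_char": next_char}
    if op_byte >= KERN_FLAG:
        step["kind"] = "kern"
        step["kern_index"] = 256 * (op_byte - KERN_FLAG) + remainder
    else:
        a, b, c = op_byte >> 2, (op_byte >> 1) & 1, op_byte & 1
        if a > b + c:
            raise ValueError("op_byte is not one of the eight ligature codes")
        step["kind"] = "lig"
        step["insert"] = remainder
        step["delete_current"] = (b == 0)
        step["delete_next"] = (c == 0)
        step["pass_over"] = a
    return step
```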
@d stop_flag=128 {value indicating `\.{STOP}' in a lig/kern program}
@d kern_flag=128 {op code for a kern step}
@ Extensible characters are specified by an |extensible_recipe|,
which consists of four bytes called |top|, |mid|,
|bot|, and |rep| (in this order). These bytes are the character codes
of individual pieces used to build up a large symbol.
If |top|, |mid|, or |bot| are zero,
they are not present in the built-up result. For example, an extensible
vertical line is like an extensible bracket, except that the top and
bottom pieces are missing.
@ The final portion of a \.{TFM} file is the |param| array, which is another
sequence of |fix_word| values.
\yskip\hang|param[1]=@!slant| is the amount of italic slant, which is used
to help position accents. For example, |slant=.25| means that when you go
up one unit, you also go .25 units to the right. The |slant| is a pure
number; it's the only |fix_word| other than the design size itself that is
not scaled by the design size.
\hang|param[2]=space| is the normal spacing between words in text.
Note that character |" "| in the font need not have anything to do with
blank spaces.
\hang|param[3]=space_stretch| is the amount of glue stretching between words.
\hang|param[4]=space_shrink| is the amount of glue shrinking between words.
\hang|param[5]=x_height| is the height of letters for which accents don't
have to be raised or lowered.
\hang|param[6]=quad| is the size of one em in the font.
\hang|param[7]=extra_space| is the amount added to |param[2]| at the
ends of sentences.
When the character coding scheme is \.{TeX math symbols}, the font is
supposed to have 15 additional parameters called |num1|, |num2|, |num3|,
|denom1|, |denom2|, |sup1|, |sup2|, |sup3|, |sub1|, |sub2|, |supdrop|,
|subdrop|, |delim1|, |delim2|, and |axis_height|, respectively. When the
character coding scheme is \.{TeX math extension}, the font is supposed to
have six additional parameters called |default_rule_thickness| and
|big_op_spacing1| through |big_op_spacing5|.
@* Binary data and binary files.
We have seen that a \.{DVI}, \.{VF}, or \.{TFM} file is a sequence of
8-bit bytes.
The bytes appear physically in what is called a `|packed file of 0..255|'
in \PASCAL\ lingo. One, two, three, or four consecutive bytes are often
interpreted as (signed or unsigned) integers.
We might as well define the corresponding data types.
@!@^system dependencies@>
@<Types...@>=
@!signed_byte=-@"80..@"7F; {signed one-byte quantity}
@!eight_bits=0..@"FF; {unsigned one-byte quantity}
@!signed_pair=-@"8000..@"7FFF; {signed two-byte quantity}
@!sixteen_bits=0..@"FFFF; {unsigned two-byte quantity}
@!signed_trio=-@"800000..@"7FFFFF; {signed three-byte quantity}
@!twentyfour_bits=0..@"FFFFFF; {unsigned three-byte quantity}
@!signed_quad=int_32; {signed four-byte quantity}
@ Packing is system dependent, and many \PASCAL\ systems fail to implement
such files in a sensible way (at least, from the viewpoint of producing
good production software). For example, some systems treat all
byte-oriented files as text, looking for end-of-line marks and such
things. Therefore some system-dependent code is often needed to deal with
binary files, even though most of the program in this section of
\.{\title} is written in standard \PASCAL.
@^system dependencies@>
One common way to solve the problem is to consider files of |integer|
numbers, and to convert an integer in the range $-2^{31}\L x<2^{31}$ to
a sequence of four bytes $(a,b,c,d)$ using the following code, which
avoids the controversial integer division of negative numbers:
$$\vbox{\halign{#\hfil\cr
|if x>=0 then a:=x div @'100000000|\cr
|else begin x:=(x+@'10000000000)+@'10000000000; a:=x div @'100000000+128;|\cr
\quad|end|\cr
|x:=x mod @'100000000;|\cr
|b:=x div @'200000; x:=x mod @'200000;|\cr
|c:=x div @'400; d:=x mod @'400;|\cr}}$$
The four bytes are then kept in a buffer and output one by one. (On 36-bit
computers, an additional division by 16 is necessary at the beginning.
Another way to separate an integer into four bytes is to use/abuse
\PASCAL's variant records, storing an integer and retrieving bytes that are
packed in the same place; {\sl caveat implementor!\/}) It is also desirable
in some cases to read a hundred or so integers at a time, maintaining a
larger buffer.
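The conversion shown above can be transcribed almost literally into Python to check it. The function name `split_four` is an assumption, and the sketch covers only the 32-bit case, not the 36-bit variant mentioned in the parenthesis.

```python
def split_four(x: int) -> tuple:
    """Split a signed 32-bit integer into four bytes (a, b, c, d)."""
    if not -2**31 <= x < 2**31:
        raise ValueError("x does not fit in four bytes")
    if x >= 0:
        a = x // 0o100000000
    else:
        # add 2^31 in two steps of 2^30, then put the sign bit back into a
        x = (x + 0o10000000000) + 0o10000000000
        a = x // 0o100000000 + 128
    x = x % 0o100000000
    b = x // 0o200000
    x = x % 0o200000
    c = x // 0o400
    d = x % 0o400
    return (a, b, c, d)
```

Every division and remainder is applied to a nonnegative operand, which is the point of the detour through $x+2^{31}$.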
@ We shall stick to simple \PASCAL\ in the standard version of this program,
for reasons of clarity, even if such simplicity is sometimes unrealistic.
@<Types...@>=
@!byte_file=packed file of eight_bits; {files that contain binary data}
@ Character packets extracted from \.{VF} files will be stored in a large
array |byte_mem|. Other packets of bytes, e.g., character packets
extracted from a \.{GF} or \.{PK} or \.{PXL} file could be stored in the
same way. A `|pckt_pointer|' variable, which signifies a packet,
is an index into another array |pckt_start|. The actual sequence of bytes
in the packet pointed to by |p| appears in positions |pckt_start[p]| to
|pckt_start[p+1]-1|, inclusive, in |byte_mem|.
Packets will also be used to store sequences of |ASCII_code|s; in this
respect the |byte_mem| array is very similar to \TeX's string pool and
part of the following code has, in fact, been copied more or less
verbatim from \TeX.
In other respects the packets resemble the identifiers used by
\.{TANGLE} and \.{WEAVE} (also stored in an array called |byte_mem|)
since there is, in general, at most one packet with a given contents;
thus part of the code below has been adapted from the corresponding code
in these programs.
Some \PASCAL\ compilers won't pack integers into a single byte unless the
integers lie in the range |-128..127|. To accommodate such systems we
access the array |byte_mem| only via macros that can easily be redefined.
@^system dependencies@>
@d bi(#) == # {convert from |eight_bits| to |packed_byte|}
@d bo(#) == # {convert from |packed_byte| to |eight_bits|}
@<Types...@>=
@!packed_byte = eight_bits; {elements of |byte_mem| array}
@!byte_pointer = 0..max_bytes; {an index into |byte_mem|}
@!pckt_pointer = 0..max_packets; {an index into |pckt_start|}
@ The global variable |byte_ptr| points to the first unused location in
|byte_mem| and |pckt_ptr| points to the first unused location in
|pckt_start|.
@<Globals...@>=
@!byte_mem: packed array [byte_pointer] of packed_byte; {bytes of packets}
@!pckt_start: array [pckt_pointer] of byte_pointer;
{directory into |byte_mem|}
@!byte_ptr: byte_pointer;
@!pckt_ptr: pckt_pointer;
@ Several of the elementary operations with packets are performed using
\.{WEB} macros instead of \PASCAL\ procedures, because many of the
operations are done quite frequently and we want to avoid the
overhead of procedure calls. For example, here is
a simple macro that computes the length of a packet.
@.WEB@>
@d pckt_length(#)==(pckt_start[#+1]-pckt_start[#]) {the number of bytes
in packet number \#}
@ Packets are created by appending bytes to |byte_mem|.
The |append_byte| macro, defined here, does not check to see if the
value of |byte_ptr| has gotten too high; this test is supposed to be
made before |append_byte| is used. There is also a |flush_byte|
macro, which erases the last byte appended.
To test if there is room to append |l| more bytes to |byte_mem|,
we shall write |pckt_room(l)|, which aborts \.{\title} and gives an
apologetic error message if there isn't enough room.
@d append_byte(#) == {put byte \# at the end of |byte_mem|}
begin byte_mem[byte_ptr]:=bi(#); incr(byte_ptr);
end
@d flush_byte == decr(byte_ptr) {forget the last byte in |byte_mem|}
@d pckt_room(#) == {make sure that |byte_mem| hasn't overflowed}
begin if byte_ptr+# > max_bytes then overflow(str_bytes,max_bytes);
end
@d append_one(#) ==
begin pckt_room(1); append_byte(#);
end
@ The length of the current packet is called |cur_pckt_length|:
@d cur_pckt_length == (byte_ptr - pckt_start[pckt_ptr])
@ Once a sequence of bytes has been appended to |byte_mem|, it
officially becomes a packet when the function |make_packet| is called.
This function returns as its value the identification number of either
an existing packet with the same contents or, if no such packet exists,
of the new packet. Thus two packets have the same contents if and only
if they have the same identification number. In order to locate the
packet with a given contents, or to find out that no such packet exists,
we need a hash table. The hash table is kept by the method of simple
chaining, where the heads of the individual lists appear in the |p_hash|
array. If |h| is a hash code, the hash table list starts at |p_hash[h]|
and proceeds through |p_link| pointers.
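The idea of returning the same identification number for equal contents can be sketched as follows. This is a simplified Python model, not the code of this program: packets are kept as separate byte strings instead of slices of |byte_mem|, and the class name `PacketTable` and the helper `_hash` are assumptions.

```python
HASH_SIZE = 353

class PacketTable:
    """Unique packets, found through a hash table with simple chaining."""

    def __init__(self):
        self.packets = [b""]            # packet 0 is the empty packet
        self.p_link = [0]               # next packet in the same hash list
        self.p_hash = [0] * HASH_SIZE   # list heads; 0 ends a list

    @staticmethod
    def _hash(data: bytes) -> int:
        h = 0
        for b in data:
            h = (h + h + b) % HASH_SIZE
        return h

    def make_packet(self, data: bytes) -> int:
        if not data:
            return 0
        h = self._hash(data)
        p = self.p_hash[h]
        while p != 0:                   # walk the chain for this hash code
            if self.packets[p] == data:
                return p                # an equal packet already exists
            p = self.p_link[p]
        p = len(self.packets)           # the packet is new
        self.packets.append(bytes(data))
        self.p_link.append(self.p_hash[h])
        self.p_hash[h] = p              # insert at the beginning of the list
        return p
```

Because every lookup goes through `make_packet`, comparing two identification numbers is enough to compare two packets.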
@d hash_size=353 {should be prime, must be |>256|}
@<Types...@>=
@!hash_code=0..hash_size;
@ @<Glob...@>=
@!p_link:array[pckt_pointer] of pckt_pointer; {hash table}
@!p_hash:array[hash_code] of pckt_pointer;
@ Initially |byte_mem| and all the hash lists are empty; |empty_packet|
is the empty packet.
@d empty_packet=0 {the empty packet}
@<Set init...@>=
pckt_ptr:=1; byte_ptr:=1;
pckt_start[0]:=1; pckt_start[1]:=1;
for h:=0 to hash_size-1 do p_hash[h]:=0;
@ @<Local variables for init...@>=
@!h:hash_code; {index into hash-head arrays}
@ Here now is the procedure for finding packets (and strings).
@p function make_packet:pckt_pointer;
label found;
var i,@!k:byte_pointer; {indices into |byte_mem|}
@!h:hash_code; {hash code}
@!s,@!l:byte_pointer; {start and length of the given packet}
@!p:pckt_pointer; {where the packet is being sought}
begin s:=pckt_start[pckt_ptr]; l:=byte_ptr-s; {compute start and length}
if l=0 then p:=empty_packet
else begin @<Compute the packet hash code |h|@>;
@<Compute the packet location |p|@>;
if pckt_ptr=max_packets then overflow(str_packets,max_packets);
incr(pckt_ptr); pckt_start[pckt_ptr]:=byte_ptr;
end;
found:make_packet:=p;
end;
@ A simple hash code is used: If the sequence of bytes is
$b_1b_2\ldots b_n$, its hash value will be
$$(2^{n-1}b_1+2^{n-2}b_2+\cdots+b_n)\,\bmod\,|hash_size|.$$
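The hash value can be accumulated byte by byte without ever forming large powers of two. The following Python sketch (the function name `packet_hash` is an assumption) doubles the running value for each further byte and reduces it modulo |hash_size|.

```python
HASH_SIZE = 353

def packet_hash(data: bytes) -> int:
    """Hash code of a non-empty byte sequence b_1 ... b_n."""
    if not data:
        raise ValueError("the empty packet is handled separately")
    h = data[0]                       # at most 255, already below HASH_SIZE
    for b in data[1:]:
        h = (h + h + b) % HASH_SIZE   # h := 2*h + b, reduced at each step
    return h
```

The first byte is used without reduction, which is safe only because |hash_size| exceeds 255; this is one reason for the requirement that it be greater than 256.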
@<Compute the packet hash...@>=
h:=bo(byte_mem[s]); i:=s+1;
while i<byte_ptr do
begin h:=(h+h+bo(byte_mem[i])) mod hash_size; incr(i);
end
@ If the packet is new, it will be placed in position |p=pckt_ptr|,
otherwise |p| will point to its existing location.
@<Compute the packet location...@>=
p:=p_hash[h];
while p<>0 do
begin if pckt_length(p)=l then
@<Compare packet |p| with current packet, |goto found| if equal@>;
p:=p_link[p];
end;
p:=pckt_ptr; {the current packet is new}
p_link[p]:=p_hash[h]; p_hash[h]:=p {insert |p| at beginning of hash list}
@ @<Compare packet |p|...@>=
begin i:=s; k:=pckt_start[p];
while (i<byte_ptr)and(byte_mem[i]=byte_mem[k]) do
begin incr(i); incr(k);
end;
if i=byte_ptr then {all bytes agree}
begin byte_ptr:=pckt_start[pckt_ptr]; goto found;
end;
end
@ Some packets are initialized with predefined strings of |ASCII_code|s;
a few macros permit us to do the initialization with a compact program.
Since this initialization is done when |byte_mem| is still empty, and
since |byte_mem| is supposed to be large enough for all the predefined
strings, |pckt_room| is used only if we are debugging.
@d pid0(#)==#:=make_packet
@d pid1(#)==byte_mem[byte_ptr-1]:=bi(#); pid0
@d pid2(#)==byte_mem[byte_ptr-2]:=bi(#); pid1
@d pid3(#)==byte_mem[byte_ptr-3]:=bi(#); pid2
@d pid4(#)==byte_mem[byte_ptr-4]:=bi(#); pid3
@d pid5(#)==byte_mem[byte_ptr-5]:=bi(#); pid4
@d pid6(#)==byte_mem[byte_ptr-6]:=bi(#); pid5
@d pid7(#)==byte_mem[byte_ptr-7]:=bi(#); pid6
@d pid8(#)==byte_mem[byte_ptr-8]:=bi(#); pid7
@d pid9(#)==byte_mem[byte_ptr-9]:=bi(#); pid8
@d pid10(#)==byte_mem[byte_ptr-10]:=bi(#); pid9
@d pid_init(#)==
@!debug pckt_room(#); @+ gubed @;
Incr(byte_ptr)(#)
@d id1==pid_init(1); pid1
@d id2==pid_init(2); pid2
@d id3==pid_init(3); pid3
@d id4==pid_init(4); pid4
@d id5==pid_init(5); pid5
@d id6==pid_init(6); pid6
@d id7==pid_init(7); pid7
@d id8==pid_init(8); pid8
@d id9==pid_init(9); pid9
@d id10==pid_init(10); pid10
@ Here we initialize some strings used as argument of the |overflow| and
|confusion| procedures.
@<Initialize predefined strings@>=
id5("f")("o")("n")("t")("s")(str_fonts);
id5("c")("h")("a")("r")("s")(str_chars);
id6("w")("i")("d")("t")("h")("s")(str_widths);
id7("p")("a")("c")("k")("e")("t")("s")(str_packets);
id5("b")("y")("t")("e")("s")(str_bytes);
id9("r")("e")("c")("u")("r")("s")("i")("o")("n")(str_recursion);
id5("s")("t")("a")("c")("k")(str_stack);
id10("n")("a")("m")("e")("l")("e")("n")("g")("t")("h")(str_name_length);
@ @<Glob...@>=
@!str_fonts,@!str_chars,@!str_widths,@!str_packets,@!str_bytes,
@!str_recursion,@!str_stack,@!str_name_length:pckt_pointer;
@ Some packets, e.g., the preamble comments of \.{DVI} and \.{VF} files,
are needed only temporarily. In such cases |new_packet| is used to
create a packet (which might duplicate an existing packet) and
|flush_packet| is used to discard it; the calls to |new_packet| and
|flush_packet| must occur in balanced pairs, without any intervening
calls to |make_packet|.
@p function new_packet: pckt_pointer;
begin if pckt_ptr=max_packets then overflow(str_packets,max_packets);
new_packet:=pckt_ptr; incr(pckt_ptr); pckt_start[pckt_ptr]:=byte_ptr;
end;
procedure flush_packet;
begin decr(pckt_ptr); byte_ptr:=pckt_start[pckt_ptr];
end;
@ The |print_packet| procedure prints the contents of a packet; such a
packet should, of course, consist of a sequence of |ASCII_code|s.
@<Basic printing...@>=
procedure print_packet(p:pckt_pointer);
var k:byte_pointer;
begin for k:=pckt_start[p] to pckt_start[p+1]-1 do
print(xchr[bo(byte_mem[k])]);
end;
@ When we interpret a packet we will use two (global or local) variables:
|cur_loc| will point to the byte to be used next, and |cur_limit| will
point to the start of the next packet. The macro |pckt_extract| will be
used to extract one byte; it should, however, never be used with
|cur_loc>=cur_limit|.
@d pckt_extract(#) ==
@!debug if cur_loc>=cur_limit then confusion(str_packets) @+ else @/
gubed @;
begin #:=bo(byte_mem[cur_loc]); incr(cur_loc); @+ end
@<Globals...@>=
@!cur_pckt: pckt_pointer; {the current packet}
@!cur_loc: byte_pointer; {current location in a packet}
@!cur_limit: byte_pointer; {start of next packet}
@ We will need routines to extract one, two, three, or four bytes from
|byte_mem|, from the \.{DVI} file, or from a \.{VF} file and assemble
them into (signed or unsigned) integers and these routines should be
optimized for efficiency. Here we define \.{WEB} macros to be used for
the body of these routines; thus the changes for system dependent
optimization have to be applied only once.
@^system dependencies@>
@^optimization@>
In addition we demonstrate how these macros can be used to define
functions that extract one, two, three, or four bytes from a character
packet and assemble them into signed or unsigned integers (assuming that
|cur_loc| and |cur_limit| are initialized suitably).
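The arithmetic performed by these routines can be summarized in a few lines of Python. The function name `assemble` is an assumption; the sketch takes the bytes as an argument instead of extracting them from a packet.

```python
def assemble(data: bytes, signed: bool) -> int:
    """Combine one to four big-endian bytes into an integer."""
    if not 1 <= len(data) <= 4:
        raise ValueError("expected one to four bytes")
    value = 0
    for i, byte in enumerate(data):
        if i == 0 and signed and byte >= 128:
            byte -= 256               # only the first byte carries the sign
        value = value * 256 + byte
    return value
```

Only the leading byte is adjusted for the sign; the remaining bytes are always added as unsigned quantities.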
@d begin_byte(#) ==
var a:eight_bits;
begin #(a)
@d comp_sbyte(#) == if a<128 then #:=a @+ else #:=a-256
@d comp_ubyte(#) == #:=a
@f begin_byte == begin
@p function pckt_sbyte:int_8; {returns the next byte, signed}
@!begin_byte(pckt_extract); comp_sbyte(pckt_sbyte);
end;
function pckt_ubyte:int_8u; {returns the next byte, unsigned}
@!begin_byte(pckt_extract); comp_ubyte(pckt_ubyte);
end;
@ @d begin_pair(#) ==
var a,@!b:eight_bits;
begin #(a); #(b)
@d comp_spair(#) == if a<128 then #:=a*256+b @+ else #:=(a-256)*256+b
@d comp_upair(#) == #:=a*256+b
@f begin_pair == begin
@p function pckt_spair:int_16; {returns the next two bytes, signed}
@!begin_pair(pckt_extract); comp_spair(pckt_spair);
end;
function pckt_upair:int_16u; {returns the next two bytes, unsigned}
@!begin_pair(pckt_extract); comp_upair(pckt_upair);
end;
@ @d begin_trio(#) ==
var a,@!b,@!c:eight_bits;
begin #(a); #(b); #(c)
@d comp_strio(#) ==
if a<128 then #:=(a*256+b)*256+c @+ else #:=((a-256)*256+b)*256+c
@d comp_utrio(#) == #:=(a*256+b)*256+c
@f begin_trio == begin
@p function pckt_strio:int_24; {returns the next three bytes, signed}
@!begin_trio(pckt_extract); comp_strio(pckt_strio);
end;
function pckt_utrio:int_24u; {returns the next three bytes, unsigned}
@!begin_trio(pckt_extract); comp_utrio(pckt_utrio);
end;
@ @d begin_quad(#) ==
var a,@!b,@!c,@!d:eight_bits;
begin #(a); #(b); #(c); #(d)
@d comp_squad(#) ==
if a<128 then #:=((a*256+b)*256+c)*256+d
else #:=(((a-256)*256+b)*256+c)*256+d
@f begin_quad == begin
@p function pckt_squad:int_32; {returns the next four bytes, signed}
@!begin_quad(pckt_extract); comp_squad(pckt_squad);
end;
@ A similar set of routines is needed for the inverse task of
decomposing a \.{DVI} command into a sequence of bytes to be appended
to |byte_mem| or, in the case of \.{DVIcopy}, to be written to the
output file. Again we define \.{WEB} macros to be used for the body
of these routines; thus the changes for system dependent optimization
have to be applied only once.
@^system dependencies@>
@^optimization@>
First, the |pckt_four| procedure outputs four bytes in two's complement
notation, without risking arithmetic overflow.
@d begin_four == begin
@d comp_four(#) ==
if x>=0 then #(x div @"1000000)
else begin Incr(x)(@"40000000); Incr(x)(@"40000000);
#((x div @"1000000) + 128);
end;
x:=x mod @"1000000; #(x div @"10000);
x:=x mod @"10000; #(x div @"100);
#(x mod @"100)
@f begin_four == begin
@p procedure pckt_four(@!x:int_32); {output four bytes}
@!begin_four; pckt_room(4); comp_four(append_byte);
end;
@ Next, the |pckt_char| procedure outputs a |set_char| or \\{set} command
or, if |upd=false|, a |put| command.
@d begin_char ==
var o:eight_bits; {|set1| or |put1|}
begin
@d comp_char(#) ==
if (not upd)or(res>127)or(ext<>0) then
begin o:=dvi_char_cmd[upd]; {|set1| or |put1|}
if ext<0 then Incr(ext)(@"1000000);
if ext=0 then #(o) @+ else @;
begin if ext<@"100 then #(o+1) @+ else @;
begin if ext<@"10000 then #(o+2) @+ else @;
begin #(o+3); #(ext div @"10000); ext:=ext mod @"10000;
end;
#(ext div @"100); ext:=ext mod @"100;
end;
#(ext);
end;
end;
#(res)
@f begin_char == begin
@p procedure pckt_char(@!upd:boolean;@!ext:int_32;@!res:eight_bits);
{output \\{set} or |put|}
@!begin_char; pckt_room(5); comp_char(append_byte);
end;
@ Then, the |pckt_unsigned| procedure outputs a |fnt| or |xxx|
command with its first parameter (normally unsigned); a |fnt| command
is converted into |fnt_num| whenever this is possible.
@d begin_unsigned == begin
@d comp_unsigned(#) ==
if (x<@"100)and(x>=0) then
if (o=fnt1)and(x<64) then Incr(x)(fnt_num_0) @+ else #(o)
else begin if (x<@"10000)and(x>=0) then #(o+1) @+ else @;
begin if (x<@"1000000)and(x>=0) then #(o+2) @+ else @;
begin #(o+3);
if x>=0 then #(x div @"1000000)
else begin Incr(x)(@"40000000); Incr(x)(@"40000000);
#((x div @"1000000) + 128);
end;
x:=x mod @"1000000;
end;
#(x div @"10000); x:=x mod @"10000;
end;
#(x div @"100); x:=x mod @"100;
end;
#(x)
@f begin_unsigned == begin
@p procedure pckt_unsigned(@!o:eight_bits;@!x:int_32);
{output |fnt_num|, |fnt|, or |xxx|}
@!begin_unsigned; pckt_room(5); comp_unsigned(append_byte);
end;
@ Finally, the |pckt_signed| procedure outputs a movement (|right|, |w|,
|x|, |down|, |y|, or |z|) command with its (signed) parameter.
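The choice of the shortest command can be sketched in Python as follows. The function name `encode_signed` is an assumption, and the sketch relies on Python's `int.to_bytes` for the two's complement bytes instead of the explicit divisions used in this program; `opcode` stands for the one-byte form of the command (143 is |right1| in the \.{DVI} format).

```python
def encode_signed(opcode: int, x: int) -> bytes:
    """Encode a movement command with the shortest signed parameter."""
    xx = x if x >= 0 else -(x + 1)    # `absolute value' that cannot overflow
    for n, limit in enumerate((0x80, 0x8000, 0x800000, 0x80000000)):
        if xx < limit:                # x fits into n+1 bytes, sign included
            return bytes([opcode + n]) + x.to_bytes(n + 1, "big", signed=True)
    raise ValueError("parameter does not fit in four bytes")
```

Using $-(x+1)$ for negative |x| makes the test symmetric: a parameter of $-128$ still fits into one byte, whereas $+128$ needs two.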
@d begin_signed ==
var xx:int_31; {`absolute value' of |x|}
begin
@d comp_signed(#) ==
if x>=0 then xx:=x @+ else xx:=-(x+1);
if xx<@"80 then
begin #(o); @+ if x<0 then Incr(x)(@"100); @+ end
else begin if xx<@"8000 then
begin #(o+1); @+ if x<0 then Incr(x)(@"10000); @+ end
else begin if xx<@"800000 then
begin #(o+2); @+ if x<0 then Incr(x)(@"1000000); @+ end
else begin #(o+3);
if x>=0 then #(x div @"1000000)
else begin x:=@"7FFFFFFF-xx; #((x div @"1000000) + 128); @+ end;
x:=x mod @"1000000;
end;
#(x div @"10000); x:=x mod @"10000;
end;
#(x div @"100); x:=x mod @"100;
end;
#(x)
@f begin_signed == begin
@p procedure pckt_signed(@!o:eight_bits;@!x:int_32);
{output |right|, |w|, |x|, |down|, |y|, or |z|}
@!begin_signed; pckt_room(5); comp_signed(append_byte);
end;
@ The character code of each character to be typeset will be decomposed
into a residue |0<=char_res<256| and extension:
|char_code=char_res+256*char_ext|; the \.{TFM} widths as well as the
pixel widths for a given resolution are the same for all characters in a
font with the same residue. A \.{VF} or \.{GF} or \.{PK} file may
contain information for several characters with the same residue but
with different extension; all except the first of the corresponding
packets in |byte_mem| will contain a pointer to the previous one and the
table of packet pointers for all legal characters in the font will point
to the last such packet.
A character packet in |byte_mem| starts with a flag byte
$$\hbox{|flag=@"40*ext_flag+@"20*chain_flag+type_flag|}$$
with |0<=ext_flag<=3|, |0<=chain_flag<=1|, |0<=type_flag<=@"1F|,
followed by |ext_flag| bytes with the character extension for this
packet and, if |chain_flag=1|, by a two byte packet pointer to the
previous packet for the same font and character residue. The actual
character packet follows after these header bytes and the
interpretation of the |type_flag| depends on whether this is a \.{VF} or
\.{GF} or \.{PK} packet.
The empty packet is interpreted as a special case of a packet with
|flag=0|.
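The layout of the flag byte can be illustrated by a pair of small Python functions. The names `pack_flag` and `unpack_flag` are assumptions; `ext_len` denotes the number of extension bytes (the quantity called |ext_flag| in the formula above).

```python
EXT_UNIT = 0x40     # weight of the extension-length field
CHAIN_UNIT = 0x20   # weight of the chain bit

def pack_flag(ext_len: int, chained: bool, type_flag: int) -> int:
    """Build the flag byte from its three fields."""
    if not (0 <= ext_len <= 3 and 0 <= type_flag < CHAIN_UNIT):
        raise ValueError("field out of range")
    return EXT_UNIT * ext_len + CHAIN_UNIT * int(chained) + type_flag

def unpack_flag(flag: int) -> tuple:
    """Return (ext_len, chained, type_flag) for a flag byte."""
    return (flag // EXT_UNIT,
            (flag % EXT_UNIT) >= CHAIN_UNIT,
            flag % CHAIN_UNIT)
```

A flag of zero decodes to no extension bytes, no chain pointer, and type zero, which matches the treatment of the empty packet.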
@d ext_flag=@"40
@d chain_flag=@"20
@<Types...@>=
@!type_flag=0..chain_flag-1; {the range of values for the |type_flag|}
@ Given an extension, the pointer to the previous packet (if any), and
a type, the |start_packet| procedure stores these header bytes in
|byte_mem|.
@d invalid_packet==max_packets {used when there is no packet}
@p procedure start_packet(@!ext:int_32;@!r:pckt_pointer;@!t:eight_bits);
label continue;
var p,@!q:pckt_pointer; {current and next packet}
@!f:eight_bits; {a flag byte}
@!e:int_24; {extension for a packet}
@!cur_loc: byte_pointer; {current location in a packet}
@!cur_limit: byte_pointer; {start of next packet}
begin
@!debug if byte_ptr<>pckt_start[pckt_ptr] then confusion(str_packets);
gubed @;@/
pckt_room(6);
if r=invalid_packet then f:=t
else begin q:=r; @<Locate character packet for extension |ext|@>;
if e=ext then dup_warning(ext);
f:=t+chain_flag
end;
if ext<0 then Incr(ext)(@"1000000);
if ext=0 then append_byte(f) @+ else @;
begin if ext<@"100 then append_byte(f+ext_flag) @+ else @;
begin if ext<@"10000 then append_byte(f+ext_flag+ext_flag) @+ else @;
begin append_byte(f+ext_flag+ext_flag+ext_flag);
append_byte(ext div @"10000); ext:=ext mod @"10000;
end;
append_byte(ext div @"100); ext:=ext mod @"100;
end;
append_byte(ext);
end;
if r<>invalid_packet then
begin append_byte(r div @"100); append_byte(r mod @"100);
end;
end;
@ Given the pointer |q| to a non-empty chain of character packets, the
|find_packet| function locates the packet for extension |ext|; if no
such packet is found, a warning message is given and the first packet
(the last one in the chain) is used. The function sets |cur_pckt|,
|cur_loc|, and |cur_limit| and returns the |type_flag| of the packet.
The |find_packet| function must not be called for an empty chain.
@p function find_packet(@!ext:int_24;@!q:pckt_pointer):type_flag;
label continue;
var p:pckt_pointer; {current packet}
@!f:eight_bits; {a flag byte}
@!e:int_24; {extension for a packet}
begin
@!debug if q=invalid_packet then confusion(str_packets);
gubed @;@/
@<Locate character packet for extension |ext|@>;
if e<>ext then subst_warning(e,ext);
cur_pckt:=p;
find_packet:=f;
end;
@ The following code follows a non-empty chain of packets until either a
packet for the desired extension is found (|e=ext|) or the chain has
ended.
@<Locate character packet for extension |ext|@>=
continue: p:=q; cur_loc:=pckt_start[p]; cur_limit:=pckt_start[p+1];
if p=empty_packet then
begin e:=0; f:=0;
end
else begin pckt_extract(f);
case (f div ext_flag) of
0: e:=0;
1: e:=pckt_ubyte;
2: e:=pckt_upair;
3: e:=pckt_strio;
end;
if (f mod ext_flag)>=chain_flag then
begin q:=pckt_upair; if e<>ext then goto continue;
end;
f:=f mod chain_flag;
end
@ The |hex_packet| procedure prints the contents of a packet in
hexadecimal form.
@<Basic printing...@>=
@!debug procedure hex_packet(@!p:pckt_pointer); {prints a packet in hex}
var j,@!k,@!l:byte_pointer; {indices into |byte_mem|}
@!d:int_8u;
begin j:=pckt_start[p]-1; k:=pckt_start[p+1]-1;
print_ln(' packet=',p:1,' start=',j+1:1,' length=',k-j:1);
for l:=j+1 to k do
begin d:=(bo(byte_mem[l])) div 16;
if d<10 then print(xchr[d+"0"]) @+ else print(xchr[d-10+"A"]);
d:=(bo(byte_mem[l])) mod 16;
if d<10 then print(xchr[d+"0"]) @+ else print(xchr[d-10+"A"]);
if (l=k)or(((l-j) mod 16)=0) then print_ln(' ')
else if ((l-j) mod 4)=0 then print('  ')
else print(' ');
end;
end;
gubed
@* File names.
The structure of file names is different for different systems; therefore
this part of the program will, in most cases, require system dependent
modifications. Here we assume that a file name consists of three parts:
an area or directory specifying where the file can be found, a name
proper and an extension; \.{\title} assumes that these three parts appear
in the order stated above, but this need not be true in all cases.
The font names extracted from \.{DVI} and \.{VF} files consist of an area
part and a name proper; these are stored as packets consisting of the
length of the area part followed by the area and the name proper.
When we print an external font name we simply print the area and the name
contained in the `file name packet' without a delimiter between them.
This may need to be modified for some systems.
@^system dependencies@>
@<Basic printing...@>=
procedure print_font(@!f:font_number);
var p:pckt_pointer; {the font name packet}
@!k:byte_pointer; {index into |byte_mem|}
@!m:int_31; {font magnification}
begin print(' = '); p:=font_name(f);
for k:=pckt_start[p]+1 to pckt_start[p+1]-1 do
print(xchr[bo(byte_mem[k])]);
m:=round((font_scaled(f)/font_design(f))*dvi_mag);
if m<>1000 then print(' scaled ',m:1);
end;
@ Before a font file can be opened for input we must build a string
with its external name.
@<Glob...@>=
@!cur_name:packed array[1..name_length] of char; {external name,
with no lower case letters}
@ For \.{TFM} and \.{VF} files we just append the appropriate extension
to the file name packet; in addition a system dependent area part
(usually different for \.{TFM} and \.{VF} files) is prepended if
the file name packet contains no area part. For other font files (e.g.,
\.{GF} or \.{PK}) the extension or area part will in most cases depend on
the resolution of the output device (corrected for font magnification).
The |make_name| procedure used to build the external file name has three
parameters: the packets with the font name and the extension, and the
length of the default area which must be copied to |cur_name| before
|make_name| is called.
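The construction of the external name can be modelled in Python as follows. This is a sketch under simplifying assumptions: the default area is passed as a byte string instead of being preloaded into |cur_name|, the result is returned as a string, and the function name and parameter names are chosen for illustration only.

```python
def make_name(font_packet: bytes, ext: bytes,
              default_area: bytes, name_length: int) -> str:
    """Build a blank-padded external file name from a font name packet."""
    area_len = font_packet[0]          # first byte: length of the area part
    body = font_packet[1:]             # area (if any) followed by the name
    prefix = b"" if area_len > 0 else default_area
    if len(prefix) + len(body) + len(ext) > name_length:
        raise OverflowError("name length exceeded")
    # only the bytes taken from the packet are converted to upper case
    name = prefix + body.upper() + ext
    return name.decode("ascii").ljust(name_length)
```

The default area is used only when the packet carries no area of its own, and neither the default area nor the extension is case-converted.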
@^system dependencies@>
@p procedure make_name(@!n,@!e:pckt_pointer;@!r:int_15);
var b:eight_bits; {a byte extracted from |byte_mem|}
@!cur_loc,@!cur_limit:byte_pointer; {indices into |byte_mem|}
begin cur_loc:=pckt_start[n]; cur_limit:=pckt_start[n+1];
pckt_extract(b); {length of area part}
if b>0 then r:=0;
if r+(cur_limit-cur_loc)+pckt_length(e)>name_length then
overflow(str_name_length,name_length);
while cur_loc<cur_limit do
begin pckt_extract(b);
if (b>="a")and(b<="z") then b:=b-@'40; {convert to upper case}
incr(r); cur_name[r]:=xchr[b];
end;
cur_loc:=pckt_start[e]; cur_limit:=pckt_start[e+1];
while cur_loc<cur_limit do
begin pckt_extract(b);
incr(r); cur_name[r]:=xchr[b];
end;
while r<name_length do begin incr(r); cur_name[r]:=' '; @+ end;
end;
@* Defining fonts.
\.{DVI} file format does not include information about character widths, since
that would tend to make the files a lot longer. But a program that reads
a \.{DVI} file is supposed to know the widths of the characters that appear
in \\{set\_char} commands. Therefore \.{\title} looks at the font metric
(\.{TFM}) files for the fonts that are involved.
@.TFM {\rm files}@>
The character-width data appears also in other files (e.g., in \.{VF} files
or in \.{GF} files that specify bit patterns for digitized characters);
thus, it is usually possible for \.{DVI} reading programs to get by with
accessing only one file per font. For \.{VF} reading programs there is,
however, a problem: (1)~when reading the character packets from a
\.{VF} file the \.{TFM} width for its local fonts should be known in
order to analyze and optimize the packets (e.g., determine if a packet
must indeed be enclosed with |push| and |pop| as implied by the \.{VF}
format); and (2)~in order to avoid infinite recursion such programs
must not try to read a \.{VF} file for a font before a character from
that font is actually used. Thus \.{\title} reads the \.{TFM} file
whenever a new font is encountered and delays the decision whether this
is a virtual font or not.
@ For the moment, we need to know only two things about a
given character |c| in a given font |f|: (1)~Is |c| a legal character
in~|f|? (2)~If so, what is the width of |c|? We also need to know the
symbolic name of each font, so it can be printed out, and we need to know
the approximate size of inter-word spaces in each font.
The answers to these questions appear implicitly in the data structures
defined in the following sections.
@ Quite often a particular width value is shared by several characters in
a font or even by characters from different fonts; the latter will
probably occur in particular for virtual fonts and the local fonts used
by them. Thus the array |widths| is used to store all different \.{TFM}
width values of all legal characters in all fonts; a variable of type
|width_pointer| is an index into |widths| or is zero if a character does
not exist. If the output is for a real typesetting device the |pix_widths|
array contains the same width values converted to (horizontal) pixels.
In order to locate a given width value we use again a hash
table with simple chaining; this time the heads of the individual lists
appear in the |w_hash| array and the lists proceed through |w_link|
pointers.
@<Types...@>=
@!width_pointer=0..max_widths; {an index into |widths|}
@!device
@!pix_value=-@"8000..@"7FFF; {a pixel coordinate or displacement;
this range may not suffice for high resolution output devices}
ecived
@ @<Glob...@>=
@!widths:array[width_pointer] of int_32; {the different width values}
@!device
@!pix_widths:array[width_pointer] of pix_value; {the widths in pixels}
ecived @; @/
@!w_link:array[width_pointer] of width_pointer; {hash table}
@!w_hash:array[hash_code] of width_pointer;
@!n_widths:width_pointer; {first unoccupied position in |widths|}
@ Initially the |widths| array and all the hash lists are empty, except
for one entry: the width value zero; in addition we set |widths[0]:=0|.
@d invalid_width=0 {width pointer for invalid characters}
@d zero_width=1 {a width pointer to the value zero}
@<Set init...@>=
w_hash[0]:=1; w_link[1]:=0; widths[0]:=0; widths[1]:=0; n_widths:=2;
@!device pix_widths[0]:=0; pix_widths[1]:=0; @+ ecived @;
for h:=1 to hash_size-1 do w_hash[h]:=0;
@ The function |make_width| returns an index into |widths| and, if
necessary, adds a new width value; thus two characters will have the
same |width_pointer| if and only if their widths agree.
@d h_pixel_round(#)==round(h_conv*(#))
@d v_pixel_round(#)==round(v_conv*(#))
@^system dependencies@>
@p function make_width(@!w:int_32):width_pointer;
label found;
var h:hash_code; {hash code}
@!p:width_pointer; {where the identifier is being sought}
@!x:int_16; {intermediate value}
begin widths[n_widths]:=w;
@<Compute the width hash code |h|@>;
@<Compute the width location |p|, |goto| found unless the value is new@>;
if n_widths=max_widths then overflow(str_widths,max_widths);
incr(n_widths);
@!device pix_widths[p]:=h_pixel_round(widths[p]); {|w| was changed by the hash computation}
@+ ecived @;
found:make_width:=p;
end;
@ A simple hash code is used: If the width value consists of the four
bytes $b_0b_1b_2b_3$, its hash value will be
$$(8*b_0+4*b_1+2*b_2+b_3)\,\bmod\,|hash_size|.$$
@<Compute the width hash...@>=
if w>=0 then x:=w div @"1000000
else begin w:=w+@"40000000; w:=w+@"40000000; x:=(w div @"1000000)+@"80;
end;
w:=w mod @"1000000; x:=x+x+(w div @"10000);
w:=w mod @"10000; x:=x+x+(w div @"100);
h:=(x+x+(w mod @"100)) mod hash_size
@ If the width is new, it has been placed into position |p=n_widths|,
otherwise |p| will point to its existing location.
@<Compute the width location...@>=
p:=w_hash[h];
while p<>0 do
begin if widths[p]=widths[n_widths] then goto found;
p:=w_link[p];
end;
p:=n_widths; {the current width is new}
w_link[p]:=w_hash[h]; w_hash[h]:=p {insert |p| at beginning of hash list}
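@ The interplay of |widths|, |w_hash|, and |w_link| can be illustrated by
a small Python sketch. It is not part of \.{\title}; the value chosen for
|HASH_SIZE| and the use of growing lists instead of fixed arrays are
assumptions made only for this illustration.

```python
HASH_SIZE = 353  # assumed size; the program's own constant is defined elsewhere

widths = [0, 0]  # widths[0]: nonexistent character, widths[1]: the value zero
w_link = [0, 0]  # chain links; 0 terminates a chain
w_hash = [0] * HASH_SIZE
w_hash[0] = 1    # the zero width lives in bucket 0


def width_hash(w):
    """Hash the four bytes b0..b3 of a signed 32-bit width value."""
    b0, b1, b2, b3 = (w & 0xFFFFFFFF).to_bytes(4, "big")
    return (8 * b0 + 4 * b1 + 2 * b2 + b3) % HASH_SIZE


def make_width(w):
    """Return the pointer for width w, adding w to the table if it is new."""
    h = width_hash(w)
    p = w_hash[h]
    while p != 0:
        if widths[p] == w:
            return p  # the value is already known
        p = w_link[p]
    p = len(widths)  # first unoccupied position
    widths.append(w)
    w_link.append(w_hash[h])  # insert p at the beginning of the hash list
    w_hash[h] = p
    return p
```

Two calls with the same value return the same pointer, so equality of
widths can be tested by comparing pointers.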
@ The |char_widths| array is used to store the |width_pointer|s for all
different characters among all fonts. For a real typesetting device the
|char_pixels| array is used to store the horizontal character escapements:
Initially we use the |pix_widths| values, but these will be replaced by
the character escapements specified in a \.{PK} or \.{GF} file;
these values may differ by a small amount.
The |char_packets| array is used to store the |pckt_pointer|s for all
different characters among all virtual fonts; pointers to packets from
other font files, e.g., from \.{PK} files, can be stored in the same way.
@<Types...@>=
@!char_offset=-255..max_chars; {|char_pointer| offset for a font}
@!char_pointer=0..max_chars; {index into |char_widths| or similar arrays}
@ @<Glob...@>=
@!char_widths:array[char_pointer] of width_pointer; {width pointers}
@!device
@!char_pixels:array[char_pointer] of pix_value; {character escapements}
ecived @; @/
@!char_packets:array[char_pointer] of pckt_pointer; {packet pointers}
@!n_chars:char_pointer; {first unused position in |char_widths|}
@!n_packets:char_pointer; {first unused position in |char_packets|}
@ @<Set init...@>=
n_chars:=0; n_packets:=0;
@ The current number of known fonts is |nf|; each known font has an
internal number |f|, where |0<=f<nf|. For the moment we need for each
known font: |font_check|, |font_scaled|, |font_design|, |font_space|,
|font_name|, |font_bc|, |font_ec|, |font_chars|, and |font_type|.
Here |font_scaled|, |font_design|, and |font_space| are measured in
\.{DVI} units and |font_chars| is of type |char_offset|:
the width pointer for character~|c| of the font is stored in
|char_widths[font_chars+c]| (for |font_bc<=c<=font_ec|).
Later on we will need additional information depending on the font type:
\.{VF} or real (\.{GF}, \.{PK}, or \.{PXL}).
These data are stored in an array of record structures with a variant
part depending on the font type and defined elsewhere in this program.
@^font types@>
@<Types...@>=
@!f_type=new_font_type..max_font_type; {type of a font}
@!font_number=0..max_fonts;
@ @<Glob...@>=
@!font_data:array[font_number] of font_record; {all data for all fonts}
@!nf:font_number;
@ We use \.{WEB} macros to access the various fields. We will say, e.g.,
|font_name(f)| for the name field of font~|f|, and |font_width(f)(c)|
for the width pointer of character~|c| in font~|f| (this character
exists provided |font_bc(f)<=c<=font_ec(f)| and |font_width(f)(c)>0|).
The actual width of character~|c| in font~|f| is stored in
|widths[font_width(f)(c)]|; the horizontal escapement is given by
|font_pixel(f)(c)|.
@d font_check(#)==font_data[#].check_field {checksum}
@d font_scaled(#)==font_data[#].scaled_field {scaled or `at' size}
@d font_design(#)==font_data[#].design_field {design size}
@d font_space(#)==font_data[#].space_field {boundary between ``small''
and ``large'' spaces}
@d font_name(#)==font_data[#].name_field {area plus name packet}
@d font_bc(#)==font_data[#].bc_field {first character}
@d font_ec(#)==font_data[#].ec_field {last character}
@d font_chars(#)==font_data[#].chars_field {character width offset}
@d font_type(#)==font_data[#].type_field {type of this font}
@d font_width_end(#)==#]
@d font_width(#)==char_widths[font_chars(#)+font_width_end
@d font_pixel(#)==char_pixels[font_chars(#)+font_width_end
@<Types...@>=
@!font_record=packed record@; {all data for one font}
@!check_field:int_32; {checksum}
@!scaled_field:int_31; {scaled size}
@!design_field:int_31; {design size}
@!device
@!space_field:int_32; {boundary between ``small'' and ``large'' spaces}
ecived @;
@!name_field:pckt_pointer; {pointer to area plus name packet}
@!bc_field:eight_bits; {first character}
@!ec_field:eight_bits; {last character}
@!chars_field:char_offset; {character width offset}
@!type_field:f_type; {type of font}
case f_type of@;@/
@<Cases for |font_record|@>@;
end;
@ Here we define the additional |font_data| fields required for a new
font which has been defined but not yet been used.
In order to simplify the \.{web2c} translation the fields in the variant
for |f_type=new_font_type| are accessed through the \.{WEB} macro
|new_font_data|.
@^font types@>@.web2c@>
Actually we need no fields for this record variant (at least for the
moment), but standard \PASCAL\ requires us to declare a field list for
each possible tag type value.
@d new_font_data(#)==font_data[#] {access |new_font_type| variant fields}
@<Cases for |font_record|@>=
new_font_type:
(); {there are no fields}
@ @d invalid_font==max_fonts {used when there is no valid font}
@<Set init...@>=
@!device font_space(invalid_font):=0; @+ ecived @;
nf:=0;
@ The |make_char_packets| function allocates and initializes packet
pointers in the |char_packets| array for all characters in a font and
returns the character packet offset.
@p function make_char_packets(@!f:font_number):char_offset;
var p:char_offset; {the character offset value to be returned}
@!k:char_pointer; {index into |char_packets|}
begin p:=n_packets-font_bc(f);
if font_ec(f)>=max_chars-p then overflow(str_chars,max_chars);
n_packets:=p+font_ec(f)+1;
for k:=p+font_bc(f) to n_packets-1 do char_packets[k]:=invalid_packet;
make_char_packets:=p;
end;
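@ The offset arithmetic of |make_char_packets| may be easier to follow in
a short Python sketch. The list |char_packets| grows on demand here, so
the overflow test disappears; the value of |INVALID_PACKET| is an
assumption made for the illustration.

```python
INVALID_PACKET = -1  # assumed marker; the program defines its own value

char_packets = []


def make_char_packets(bc, ec):
    """Reserve slots for characters bc..ec and return the packet offset."""
    offset = len(char_packets) - bc  # so that offset + bc is the first new slot
    char_packets.extend([INVALID_PACKET] * (ec - bc + 1))
    return offset
```

The packet pointer of character |c| is then found at position |offset+c|.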
@ In order to read \.{TFM} files the program uses the binary file
variable |tfm_file|.
@<Glob...@>=
@!tfm_file:byte_file; {a \.{TFM} file}
@!tfm_ext:pckt_pointer; {extension for \.{TFM} files}
@!cur_tfm:font_number; {font number of current \.{TFM} file}
@ @<Initialize predefined strings@>=
id4(".")("T")("F")("M")(tfm_ext); {file name extension for \.{TFM} files}
@ If a \.{TFM} file is badly malformed, we say |bad_tfm|; this procedure
gives an error message which refers the user to \.{TFtoPL} and \.{PLtoTF},
and terminates \.{\title}.
@<Error handling...@>=
procedure bad_tfm;
begin print_ln(' ');
print('Bad TFM file'); print_font(cur_tfm); print_ln('!');
@.Bad TFM file@>
abort('Use TFtoPL/PLtoTF to diagnose and correct the problem');
@.Use TFtoPL/PLtoTF@>
end;
@ To prepare |tfm_file| for input we |reset| it.
@<TFM: Open |tfm_file|@>=
reset(tfm_file,cur_name);
@^system dependencies@>
if eof(tfm_file) then
abort('---not loaded, TFM file can''t be opened!');
@.TFM file can\'t be opened@>
cur_tfm:=f {in case |bad_tfm| is called}
@ For some operating systems it may be necessary to close |tfm_file|.
@<TFM: Close |tfm_file|@>=
@ It turns out to be convenient to read four bytes at a time, when we
are inputting from \.{TFM} files. The input goes into global variables
|tfm_b0|, |tfm_b1|, |tfm_b2|, and |tfm_b3|, with |tfm_b0| getting the
first byte and |tfm_b3| the fourth.
@<Glob...@>=
@!tfm_b0,@!tfm_b1,@!tfm_b2,@!tfm_b3: eight_bits; {four bytes input at once}
@ Reading a \.{TFM} file should be done as efficiently as possible for a
particular system; on many systems this means that a large number of
bytes from |tfm_file| is read into a buffer and will then be extracted
from that buffer. In order to simplify such system dependent changes
we use the \.{WEB} macro |tfm_byte| to extract the next \.{TFM} byte;
this macro and |eof(tfm_file)| are used only in the |read_tfm_word|
procedure which sets |tfm_b0| through |tfm_b3| to the next four bytes
in the current \.{TFM} file. Here we give simple-minded definitions in
terms of standard \PASCAL.
@^system dependencies@>
@^optimization@>
@d tfm_byte(#)==read(tfm_file,#) {read next \.{TFM} byte}
@p procedure read_tfm_word;
begin tfm_byte(tfm_b0); tfm_byte(tfm_b1);
tfm_byte(tfm_b2); tfm_byte(tfm_b3);
if eof(tfm_file) then bad_tfm;
end;
@ Here are three procedures used to check the consistency of font files:
First, the |check_check_sum| procedure compares two check sum values: a
warning is given if they differ and are both non-zero; if the second
value is not zero it replaces the first one.
Next, the |check_design_size| procedure compares two design size
values: a warning is given if they differ by more than a small amount.
Finally, the |check_width| procedure compares two character width
values: a warning is given if they differ.
@p procedure check_check_sum(@!f:font_number;@!c:int_32);
{compare |font_check(f)| with |c|}
begin if (c<>font_check(f))and(c<>0) then
begin
if font_check(f)<>0 then
begin print_ln('---beware: check sums do not agree!');
@.beware: check sums do not agree@>
@.check sums do not agree@>
print_ln(' (',c:1,' vs. ',font_check(f):1,')');
d_print(' ');
mark_error;
end;
font_check(f):=c;
end;
end;
procedure check_design_size(@!f:font_number;@!d:int_32);
{compare |font_design(f)| with |d|}
begin if abs(d-font_design(f))>2 then
begin print_ln('---beware: design sizes do not agree!');
@.beware: design sizes do not agree@>
@.design sizes do not agree@>
print_ln(' (',d:1,' vs. ',font_design(f):1,')');
d_print(' ');
mark_error;
end;
end;
procedure check_width(@!p:width_pointer;w:int_32);
{compare |widths[p]| with |w|}
begin if w<>widths[p] then
begin print_ln('---beware: char widths do not agree!');
@.beware: char widths do not agree@>
@.char widths do not agree@>
print_ln(' (',w:1,' vs. ',widths[p]:1,')');
d_print(' ');
mark_error;
end;
end;
@ When processing a font definition we put the data extracted from the
\.{DVI} or \.{VF} file into the fields of |font_data[nf]| and call
|make_font| to obtain the internal font number for this font.
The function |make_font| determines if this font is already defined and,
if this is not the case, reads the \.{TFM} file.
@p function make_font:font_number;
var f:font_number; {internal font number of this font}
@!k:int_16; {loop index}
@!p:char_pointer; {index into |char_widths|}
@!q:width_pointer; {index into |widths|}
@!bc,@!ec:int_15; {first and last character in this font}
@!lh:int_15; {length of header in four byte words}
@!nw:int_15; {number of words in width table}
@!w:int_32; {a four byte integer}
@<Variables for scaling computation@>@;
begin f:=0;
while (font_name(f)<>font_name(nf))or@|
(font_scaled(f)<>font_scaled(nf)) do incr(f);
d_print(' => ',f:1); print_font(f);
if f<nf then begin check_check_sum(f,font_check(nf));
check_design_size(f,font_design(nf));
d_print(' loaded previously');
end
else @<Define a new font@>;
print_ln('.');
make_font:=f;
end;
@ If no font directory has been specified, \.{\title} is supposed to use
the default \.{TFM} directory, which is a system-dependent place where
the \.{TFM} files for standard fonts are kept.
The string variable |TFM_default_area| contains the name of this area.
@^system dependencies@>
@d TFM_default_area_name=='TeXfonts:' {change this to the correct name}
@d TFM_default_area_name_length=9 {change this to the correct length}
@<Glob...@>=
@!TFM_default_area:packed array[1..TFM_default_area_name_length] of char;
@ @<Set init...@>=
TFM_default_area:=TFM_default_area_name;
@ @<Define a new font@>=
begin if nf=max_fonts then overflow(str_fonts,max_fonts);
font_type(f):=new_font_type;
for k:=1 to TFM_default_area_name_length do
cur_name[k]:=TFM_default_area[k];
make_name(font_name(f),tfm_ext,TFM_default_area_name_length);
@<TFM: Open |tfm_file|@>;
@<TFM: Read past the header data@>;
@<TFM: Store character-width indices@>;
@<TFM: Read and convert the width values@>;
@<TFM: Convert character-width indices to character-width pointers@>;
@<TFM: Close |tfm_file|@>;
d_print(' loaded at ',font_scaled(f):1,' DVI units');
incr(nf);
end
@ @<Glob...@>=
@!tfm_conv:real; {\.{DVI} units per absolute \.{TFM} unit}
@ We will use the following \.{WEB} macros to construct integers from
two or four of the four bytes read by |read_tfm_word|.
@^system dependencies@>
@d tfm_b01(#)== {|tfm_b0..tfm_b1| as non-negative integer}
if tfm_b0>127 then bad_tfm
else #:=tfm_b0*256+tfm_b1
@d tfm_b23(#)== {|tfm_b2..tfm_b3| as non-negative integer}
if tfm_b2>127 then bad_tfm
else #:=tfm_b2*256+tfm_b3
@d tfm_squad(#)== {|tfm_b0..tfm_b3| as signed integer}
if tfm_b0<128 then #:=((tfm_b0*256+tfm_b1)*256+tfm_b2)*256+tfm_b3
else #:=(((tfm_b0-256)*256+tfm_b1)*256+tfm_b2)*256+tfm_b3
@d tfm_uquad== {|tfm_b0..tfm_b3| as unsigned integer}
(((tfm_b0*256+tfm_b1)*256+tfm_b2)*256+tfm_b3)
@<TFM: Read past the header data@>=
read_tfm_word; tfm_b23(lh);
read_tfm_word; tfm_b01(bc); tfm_b23(ec);
if ec<bc then
begin bc:=1; ec:=0;
end
else if ec>255 then bad_tfm;
read_tfm_word; tfm_b01(nw);
if (nw=0)or(nw>256) then bad_tfm;
for k:=-2 to lh do
begin read_tfm_word;
if k=1 then begin tfm_squad(w); check_check_sum(f,w);
end
else if k=2 then begin if tfm_b0>127 then bad_tfm;
check_design_size(f,round(tfm_conv*tfm_uquad));
end;
end
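@ For readers who prefer to see the byte layout at a glance, here is a
Python sketch of the same header checks. It assumes that the whole
\.{TFM} file is available as a byte string; the function name
|read_tfm_prefix| is hypothetical, and the design size is returned as a
raw |fix_word| without the conversion by |tfm_conv|.

```python
import struct


def read_tfm_prefix(data):
    """Return (lh, bc, ec, nw, checksum, design_size) from a TFM byte string."""
    lf, lh, bc, ec, nw, nh = struct.unpack(">6H", data[:12])
    if lh > 0x7FFF or bc > 0x7FFF or ec > 0x7FFF or nw > 0x7FFF:
        raise ValueError("Bad TFM file")
    if ec < bc:
        bc, ec = 1, 0  # the font has no characters
    elif ec > 255:
        raise ValueError("Bad TFM file")
    if nw == 0 or nw > 256:
        raise ValueError("Bad TFM file")
    # Twelve 16-bit lengths occupy bytes 0..23; the header words follow.
    checksum, design_size = struct.unpack(">iI", data[24:32])
    if design_size > 0x7FFFFFFF:
        raise ValueError("Bad TFM file")
    return lh, bc, ec, nw, checksum, design_size
```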
@ The width indices for the characters are stored in positions |n_chars|
through |n_chars-bc+ec| of the |char_widths| array; if characters on
either end of the range |bc..ec| do not exist, they are ignored and the
range is adjusted accordingly.
@<TFM: Store character-width indices@>=
read_tfm_word;
while (tfm_b0=0)and(bc<=ec) do
begin incr(bc); read_tfm_word;
end;
font_bc(f):=bc; font_chars(f):=n_chars-bc;
if ec>=max_chars-font_chars(f) then overflow(str_chars,max_chars);
for k:=bc to ec do
begin char_widths[n_chars]:=tfm_b0; incr(n_chars); read_tfm_word;
end;
while (char_widths[n_chars-1]=0)and(ec>=bc) do
begin decr(n_chars); decr(ec);
end;
font_ec(f):=ec
@ The most important part of |make_font| is the width computation, which
involves multiplying the relative widths in the \.{TFM} file by the
scaling factor in the \.{DVI} file. This fixed-point multiplication
must be done with precisely the same accuracy by all \.{DVI}-reading programs,
in order to validate the assumptions made by \.{DVI}-writing programs
like \TeX82.
Let us therefore summarize what needs to be done. Each width in a \.{TFM}
file appears as a four-byte quantity called a |fix_word|. A |fix_word|
whose respective bytes are $(a,b,c,d)$ represents the number
$$x=\left\{\vcenter{\halign{$#$,\hfil\qquad&if $#$\hfil\cr
b\cdot2^{-4}+c\cdot2^{-12}+d\cdot2^{-20}&a=0;\cr
-16+b\cdot2^{-4}+c\cdot2^{-12}+d\cdot2^{-20}&a=255.\cr}}\right.$$
(No other choices of $a$ are allowed, since the magnitude of a \.{TFM}
dimension must be less than 16.) We want to multiply this quantity by the
integer~|z|, which is known to be less than $2^{27}$.
If $|z|<2^{23}$, the individual multiplications $b\cdot z$, $c\cdot z$,
$d\cdot z$ cannot overflow; otherwise we will divide |z| by 2, 4, 8, or
16, to obtain a multiplier less than $2^{23}$, and we can compensate for
this later. If |z| has thereby been replaced by $|z|^\prime=|z|/2^e$, let
$\beta=2^{4-e}$; we shall compute
$$\lfloor(b+c\cdot2^{-8}+d\cdot2^{-16})\,z^\prime/\beta\rfloor$$ if $a=0$,
or the same quantity minus $\alpha=2^{4+e}z^\prime$ if $a=255$.
This calculation must be
done exactly, for the reasons stated above; the following program does the
job in a system-independent way, assuming that arithmetic is exact on
numbers less than $2^{31}$ in magnitude.
@ Since the dimensions extracted from a \.{VF} file have to be scaled in
exactly the same way as the \.{TFM} width values, we shall use the same
code in both cases; thus these computations need to be optimized for a
particular system only once.
@^system dependencies@>
@^optimization@>
@<Variables for scaling computation@>=
@!z:int_32; {multiplier}
@!alpha:int_32; {correction for negative values}
@!beta:int_15; {divisor}
@ @<Replace |z| by $|z|^\prime$ and compute $\alpha,\beta$@>=
alpha:=16;
while z>=@'40000000 do
begin z:=z div 2; alpha:=alpha+alpha;
end;
beta:=256 div alpha; alpha:=alpha*z
@ @<Scaled value of |tfm_b1..tfm_b3|@>=
(((((tfm_b3*z)div@'400)+(tfm_b2*z))div@'400)+(tfm_b1*z))div beta
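@ As a cross-check of the scaling rules, the computation can be restated
in Python, whose integers never overflow. This is only a sketch; the
function name |scale_fix_word| is hypothetical, and the bytes $(a,b,c,d)$
and the multiplier |z| are passed as arguments instead of being taken
from global variables.

```python
def scale_fix_word(a, b, c, d, z):
    """Scale the fix_word with bytes (a, b, c, d) by z, where 0 < z < 2**27."""
    if a not in (0, 255):
        raise ValueError("magnitude of a TFM dimension must be less than 16")
    alpha = 16
    while z >= 0o40000000:  # 2**23
        z //= 2
        alpha += alpha
    beta = 256 // alpha
    alpha *= z
    w = (((d * z) // 0o400 + c * z) // 0o400 + b * z) // beta
    if a == 255:
        w -= alpha  # correction for negative values
    return w
```

For example, the |fix_word| 1.0 has the bytes $(0,16,0,0)$; scaling it by
a |z| of 655360 gives 655360 again.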
@ The first width value, which indicates that a character does not exist
and which must vanish, is converted to the width pointer value zero;
the other width values are scaled by |font_scaled(f)| and converted
to width pointers by |make_width|. The resulting width pointers are
stored temporarily in the |char_widths| array, following the width indices.
@<TFM: Read and convert the width values@>=
if nw-1>max_chars-n_chars then overflow(str_chars,max_chars);
if (tfm_b0<>0)or(tfm_b1<>0)or(tfm_b2<>0)or(tfm_b3<>0) then bad_tfm
else char_widths[n_chars]:=0;
z:=font_scaled(f);
@!device font_space(f):=z div 6; {this is a 3-unit ``thin space''}
ecived @;
@<Replace |z|...@>;
for k:=1 to nw-1 do begin read_tfm_word;
w:=@<Scaled value of |tfm_b1..tfm_b3|@>;
if tfm_b0>0 then if tfm_b0<255 then bad_tfm
else Decr(w)(alpha);
char_widths[n_chars+k]:=make_width(w);
end
@ We simply translate the width indices into width pointers.
@<TFM: Convert character-width indices to character-width pointers@>=
for p:=font_chars(f)+bc to n_chars-1 do
begin q:=char_widths[n_chars+char_widths[p]]; char_widths[p]:=q;
@!device char_pixels[p]:=pix_widths[q]; @+ ecived @; @/
end
@ The global variable |cur_fnt| contains the internal font number of
the currently selected font or the value |invalid_font| if no font has
been selected.
@<Glob...@>=
@!cur_fnt:font_number; {the currently selected font}
@* Low-level DVI input routines.
The program uses the binary file variable |dvi_file| for its main input
file; |dvi_loc| is the number of the byte about to be read next from
|dvi_file|.
@<Glob...@>=
@!dvi_file:byte_file; {the stuff we are \.{\title}ing}
@!dvi_loc:int_32; {where we are about to look, in |dvi_file|}
@ If the \.{DVI} file is badly malformed, we say |bad_dvi|; this
procedure gives an error message which refers the user to \.{DVItype},
and terminates \.{\title}.
@<Error handling...@>=
procedure bad_dvi;
begin print_ln(' ');
print_ln('Bad DVI file: loc=',dvi_loc:1,'!');
@.Bad DVI file@>
print(' Use DVItype with output level');
@.Use DVItype@>
if random_reading then print('=4') @+ else print('<4');
abort('to diagnose the problem');
end;
@ To prepare |dvi_file| for input, we |reset| it.
@<Open input file(s)@>=
reset(dvi_file); {prepares to read packed bytes from |dvi_file|}
dvi_loc:=0;
@ For some operating systems it may be necessary to close |dvi_file|.
@<Close input file(s)@>=
@ Reading the \.{DVI} file should be done as efficiently as possible for a
particular system; on many systems this means that a large number of
bytes from |dvi_file| is read into a buffer and will then be extracted
from that buffer. In order to simplify such system dependent changes
we use a pair of \.{WEB} macros: |dvi_byte| extracts the next \.{DVI}
byte and |dvi_eof| is |true| if we have reached the end of the \.{DVI}
file. Here we give simple-minded definitions for these macros in terms
of standard \PASCAL.
@^system dependencies@>
@^optimization@>
@d dvi_eof == eof(dvi_file) {has the \.{DVI} file been exhausted?}
@d dvi_byte(#) ==
if dvi_eof then bad_dvi
else read(dvi_file,#) {obtain next \.{DVI} byte}
@ Next we come to the routines that are used only if |random_reading| is
|true|. The driver program below needs two such routines: |dvi_length| should
compute the total number of bytes in |dvi_file|, possibly also
causing |eof(dvi_file)| to be true; and |dvi_move(n)| should position
|dvi_file| so that the next |dvi_byte| will read byte |n|, starting with
|n=0| for the first byte in the file.
@^system dependencies@>
Such routines are, of course, highly system dependent. They are implemented
here in terms of two assumed system routines called |set_pos| and |cur_pos|.
The call |set_pos(f,n)| moves to item |n| in file |f|, unless |n| is
negative or larger than the total number of items in |f|; in the latter
case, |set_pos(f,n)| moves to the end of file |f|.
The call |cur_pos(f)| gives the total number of items in |f|, if
|eof(f)| is true; we use |cur_pos| only in such a situation.
@p function dvi_length:int_32;
begin set_pos(dvi_file,-1); dvi_length:=cur_pos(dvi_file);
end;
procedure dvi_move(n:int_32);
begin set_pos(dvi_file,n); dvi_loc:=n;
end;
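@ On a system with seekable byte streams the two routines reduce to
simple seek operations. The following Python sketch shows the idea; it
assumes a binary file object |f| and is not meant as a replacement for
the system-dependent |set_pos| and |cur_pos|.

```python
import io
import os


def dvi_length(f):
    """Move to the end of f and return the total number of bytes."""
    return f.seek(0, os.SEEK_END)


def dvi_move(f, n):
    """Position f so that the next byte read is byte n; return the new location."""
    f.seek(n)
    return n
```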
@ We need seven simple functions to read the next byte or bytes
from |dvi_file|.
@p function dvi_sbyte:int_8; {returns the next byte, signed}
@!begin_byte(dvi_byte); incr(dvi_loc); comp_sbyte(dvi_sbyte);
end;
function dvi_ubyte:int_8u; {returns the next byte, unsigned}
@!begin_byte(dvi_byte); incr(dvi_loc); comp_ubyte(dvi_ubyte);
end;
function dvi_spair:int_16; {returns the next two bytes, signed}
@!begin_pair(dvi_byte); Incr(dvi_loc)(2); comp_spair(dvi_spair);
end;
function dvi_upair:int_16u; {returns the next two bytes, unsigned}
@!begin_pair(dvi_byte); Incr(dvi_loc)(2); comp_upair(dvi_upair);
end;
function dvi_strio:int_24; {returns the next three bytes, signed}
@!begin_trio(dvi_byte); Incr(dvi_loc)(3); comp_strio(dvi_strio);
end;
function dvi_utrio:int_24u; {returns the next three bytes, unsigned}
@!begin_trio(dvi_byte); Incr(dvi_loc)(3); comp_utrio(dvi_utrio);
end;
function dvi_squad:int_32; {returns the next four bytes, signed}
@!begin_quad(dvi_byte); Incr(dvi_loc)(4); comp_squad(dvi_squad);
end;
@ Three other functions are used in cases where a four byte integer
(which is always signed) must have a non-negative value, a positive
value, or is a pointer which must be either positive or |=-1|.
@p function dvi_uquad:int_31; {result must be non-negative}
var x:int_32;
begin x:=dvi_squad; if x<0 then bad_dvi
else dvi_uquad:=x;
end;
function dvi_pquad:int_31; {result must be positive}
var x:int_32;
begin x:=dvi_squad; if x<=0 then bad_dvi
else dvi_pquad:=x;
end;
function dvi_pointer:int_32; {result must be positive or |=-1|}
var x:int_32;
begin x:=dvi_squad; if (x<=0)and(x<>-1) then bad_dvi
else dvi_pointer:=x;
end;
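@ The behaviour of these input functions can be summarized by a Python
sketch. The class name |DviReader| and the use of an in-memory byte
string are assumptions of the illustration; a |ValueError| stands in for
the call of |bad_dvi|.

```python
import io


class DviReader:
    """Read big-endian integers from DVI data and keep track of dvi_loc."""

    def __init__(self, data):
        self.f = io.BytesIO(data)
        self.loc = 0

    def _read(self, n, signed):
        raw = self.f.read(n)
        if len(raw) < n:  # premature end of file
            raise ValueError("Bad DVI file: loc=%d" % self.loc)
        self.loc += n
        return int.from_bytes(raw, "big", signed=signed)

    def sbyte(self):
        return self._read(1, True)

    def ubyte(self):
        return self._read(1, False)

    def squad(self):
        return self._read(4, True)

    def uquad(self):
        x = self.squad()
        if x < 0:  # result must be non-negative
            raise ValueError("Bad DVI file: loc=%d" % self.loc)
        return x

    def pointer(self):
        x = self.squad()
        if x <= 0 and x != -1:  # result must be positive or -1
            raise ValueError("Bad DVI file: loc=%d" % self.loc)
        return x
```

The pair and trio variants follow the same pattern with two and three
bytes.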
@ Given the structure of the \.{DVI} commands it is fairly obvious
that their interpretation consists of two steps: First zero to four
bytes are read in order to obtain the value of the first parameter
(e.g., zero bytes for |set_char_0|, four bytes for |set4|); then,
depending on the command class, a specific action is performed (e.g.,
typeset a character but don't move the reference point for |put1..put4|).
The \.{DVItype} program uses large case statements for both steps;
unfortunately some \PASCAL\ compilers fail to implement large case
statements efficiently -- in particular ones like the one used in the
|first_par| function of \.{DVItype}. Here we use a pair of look up tables:
|dvi_par| determines how to obtain the value of the first parameter, and
|dvi_cl| determines the command class.
A slight complication arises from the fact that we want to decompose the
character code of each character to be typeset into a residue
|0<=char_res<256| and extension: |char_code=char_res+256*char_ext|;
the \.{TFM} widths as well as the pixel widths for a given resolution
are the same for all characters in a font with the same residue.
@d two_cases(#)==#,#+1
@d three_cases(#)==#,#+1,#+2
@d five_cases(#)==#,#+1,#+2,#+3,#+4
@ First we define the values used as array elements of |dvi_par|; we
distinguish between pure numbers and dimensions because dimensions read
from a \.{VF} file must be scaled.
@d char_par=0 {character for \\{set} and |put|}
@d no_par=1 {no parameter}
@d dim1_par=2 {one-byte signed dimension}
@d num1_par=3 {one-byte unsigned number}
@d dim2_par=4 {two-byte signed dimension}
@d num2_par=5 {two-byte unsigned number}
@d dim3_par=6 {three-byte signed dimension}
@d num3_par=7 {three-byte unsigned number}
@d dim4_par=8 {four-byte signed dimension}
@d num4_par=9 {four-byte signed number}
@d numu_par=10 {four-byte non-negative number}
@d rule_par=11 {dimensions for |set_rule| and |put_rule|}
@d fnt_par=12 {font for |fnt_num| commands}
@d max_par=12 {largest possible value}
@<Types...@>=
@!cmd_par=char_par..max_par;
@ Here we declare the array |dvi_par|.
@<Globals...@>=
@!dvi_par:packed array [eight_bits] of cmd_par;
@ And here we initialize it.
@<Set init...@>=
for i:=0 to put1+3 do dvi_par[i]:=char_par;@/
for i:=nop to 255 do dvi_par[i]:=no_par;@/
dvi_par[set_rule]:=rule_par; dvi_par[put_rule]:=rule_par;@/
dvi_par[right1]:=dim1_par; dvi_par[right1+1]:=dim2_par;
dvi_par[right1+2]:=dim3_par; dvi_par[right1+3]:=dim4_par;@/
for i:=fnt_num_0 to fnt_num_0+63 do dvi_par[i]:=fnt_par;@/
dvi_par[fnt1]:=num1_par; dvi_par[fnt1+1]:=num2_par;
dvi_par[fnt1+2]:=num3_par; dvi_par[fnt1+3]:=num4_par;@/
dvi_par[xxx1]:=num1_par; dvi_par[xxx1+1]:=num2_par;
dvi_par[xxx1+2]:=num3_par; dvi_par[xxx1+3]:=numu_par;@/
for i:=0 to 3 do
begin dvi_par[i+w1]:=dvi_par[i+right1];
dvi_par[i+x1]:=dvi_par[i+right1];
dvi_par[i+down1]:=dvi_par[i+right1];
dvi_par[i+y1]:=dvi_par[i+right1];
dvi_par[i+z1]:=dvi_par[i+right1];
dvi_par[i+fnt_def1]:=dvi_par[i+fnt1];
end;
@ Next we define the values used as array elements of |dvi_cl|;
several \.{DVI} commands (e.g., |nop|, |bop|, |eop|, |pre|, |post|) will
always be treated separately and are therefore assigned to the invalid
class here.
@d char_cl=0
@d rule_cl=char_cl+1
@d xxx_cl=char_cl+2
@d push_cl=3
@d pop_cl=4
@d w0_cl=5
@d x0_cl=w0_cl+1
@d right_cl=w0_cl+2
@d w_cl=w0_cl+3
@d x_cl=w0_cl+4
@d y0_cl=10
@d z0_cl=y0_cl+1
@d down_cl=y0_cl+2
@d y_cl=y0_cl+3
@d z_cl=y0_cl+4
@d fnt_cl=15
@d fnt_def_cl=16
@d invalid_cl=17
@d max_cl=invalid_cl {largest possible value}
@<Types...@>=
@!cmd_cl=char_cl..max_cl;
@ Here we declare the array |dvi_cl|.
@<Globals...@>=
@!dvi_cl:packed array [eight_bits] of cmd_cl;
@ And here we initialize it.
@<Set init...@>=
for i:=set_char_0 to put1+3 do dvi_cl[i]:=char_cl;
dvi_cl[set_rule]:=rule_cl; dvi_cl[put_rule]:=rule_cl;@/
dvi_cl[nop]:=invalid_cl;
dvi_cl[bop]:=invalid_cl; dvi_cl[eop]:=invalid_cl;@/
dvi_cl[push]:=push_cl; dvi_cl[pop]:=pop_cl;@/
dvi_cl[w0]:=w0_cl; dvi_cl[x0]:=x0_cl;@/
dvi_cl[y0]:=y0_cl; dvi_cl[z0]:=z0_cl;@/
for i:=0 to 3 do
begin dvi_cl[i+right1]:=right_cl;
dvi_cl[i+w1]:=w_cl;
dvi_cl[i+x1]:=x_cl;@/
dvi_cl[i+down1]:=down_cl;
dvi_cl[i+y1]:=y_cl;
dvi_cl[i+z1]:=z_cl;@/
dvi_cl[i+xxx1]:=xxx_cl;
dvi_cl[i+fnt_def1]:=fnt_def_cl;
end;
for i:=fnt_num_0 to fnt1+3 do dvi_cl[i]:=fnt_cl;
for i:=pre to 255 do dvi_cl[i]:=invalid_cl;
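@ To see the effect of this initialization, the class table can be
rebuilt in Python with the opcode values of the \.{DVI} format. This is
only a sketch: the classes are represented by strings rather than by the
numeric constants defined above, and the upper-case opcode names are
introduced just for the illustration.

```python
# DVI opcodes, as given in the definition of the DVI format
SET_RULE, PUT1, PUT_RULE, PUSH, POP = 132, 133, 137, 141, 142
RIGHT1, W0, W1, X0, X1, DOWN1 = 143, 147, 148, 152, 153, 157
Y0, Y1, Z0, Z1 = 161, 162, 166, 167
FNT_NUM_0, FNT1, XXX1, FNT_DEF1 = 171, 235, 239, 243

# nop, bop, eop, and everything from pre upward remain in the invalid class
dvi_cl = ["invalid"] * 256
for i in range(0, PUT1 + 4):  # set_char_0 .. put4
    dvi_cl[i] = "char"
dvi_cl[SET_RULE] = dvi_cl[PUT_RULE] = "rule"
dvi_cl[PUSH], dvi_cl[POP] = "push", "pop"
dvi_cl[W0], dvi_cl[X0] = "w0", "x0"
dvi_cl[Y0], dvi_cl[Z0] = "y0", "z0"
for i in range(4):
    dvi_cl[RIGHT1 + i] = "right"
    dvi_cl[W1 + i] = "w"
    dvi_cl[X1 + i] = "x"
    dvi_cl[DOWN1 + i] = "down"
    dvi_cl[Y1 + i] = "y"
    dvi_cl[Z1 + i] = "z"
    dvi_cl[XXX1 + i] = "xxx"
    dvi_cl[FNT_DEF1 + i] = "fnt_def"
for i in range(FNT_NUM_0, FNT1 + 4):  # fnt_num_0 .. fnt4
    dvi_cl[i] = "fnt"
```

A single table lookup such as |dvi_cl[cur_cmd]| then replaces the large
case statement.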
@ A few small arrays are used to generate \.{DVI} commands.
@<Glob...@>=
@!dvi_char_cmd:array[boolean] of eight_bits; {|put1| and |set1|}
@!dvi_rule_cmd:array[boolean] of eight_bits; {|put_rule| and |set_rule|}
@!dvi_right_cmd:array[right_cl..x_cl] of eight_bits; {|right1|, |w1|, and |x1|}
@!dvi_down_cmd:array[down_cl..z_cl] of eight_bits; {|down1|, |y1|, and |z1|}
@ @<Set init...@>=
dvi_char_cmd[false]:=put1;
dvi_char_cmd[true]:=set1;@/
dvi_rule_cmd[false]:=put_rule;
dvi_rule_cmd[true]:=set_rule;@/
dvi_right_cmd[right_cl]:=right1;
dvi_right_cmd[w_cl]:=w1;
dvi_right_cmd[x_cl]:=x1;@/
dvi_down_cmd[down_cl]:=down1;
dvi_down_cmd[y_cl]:=y1;
dvi_down_cmd[z_cl]:=z1;
@ The global variables |cur_cmd|, |cur_parm| and |cur_class| are used
for the current \.{DVI} command, its first parameter (if any), and its
command class respectively.
@<Glob...@>=
@!cur_cmd:eight_bits; {current \.{DVI} command byte}
@!cur_parm:int_32; {its first parameter (if any)}
@!cur_class:cmd_cl; {its class}
@ When typesetting a character, |cur_ext| and |cur_res| are its
extension and residue; when typesetting a character or rule, the boolean
variable |cur_upd| is |true| for \\{set} commands, |false| for |put|
commands.
@<Glob...@>=
@!cur_ext:int_24; {the current character extension}
@!cur_res:int_8u; {the current character residue}
@!cur_wp:width_pointer; {width pointer of the current character}
@!cur_upd:boolean; {is this a \\{set} or |set_rule| command ?}
@!cur_v_dimen:int_32; {a vertical dimension}
@!cur_h_dimen:int_32; {a horizontal dimension}
@ The |dvi_first_par| procedure first reads \.{DVI} command bytes into
|cur_cmd| until |cur_cmd<>nop|; then |cur_parm| is set to the value of
the first parameter (if any) and |cur_class| to the command class.
@p procedure dvi_first_par;
begin repeat cur_cmd:=dvi_ubyte;
until cur_cmd<>nop; {skip over |nop|s}
case dvi_par[cur_cmd] of
char_par: if cur_cmd<set1 then
begin cur_ext:=0; cur_res:=cur_cmd; cur_upd:=true
end
else begin cur_upd:=(cur_cmd<put1);
case cur_cmd-dvi_char_cmd[cur_upd] of
0: cur_ext:=0;
1: cur_ext:=dvi_ubyte;
2: cur_ext:=dvi_upair;
3: cur_ext:=dvi_strio;
end;
cur_res:=dvi_ubyte;
end;
no_par: do_nothing;
dim1_par: cur_parm:=dvi_sbyte;
num1_par: cur_parm:=dvi_ubyte;
dim2_par: cur_parm:=dvi_spair;
num2_par: cur_parm:=dvi_upair;
dim3_par: cur_parm:=dvi_strio;
num3_par: cur_parm:=dvi_utrio;
two_cases(dim4_par): cur_parm:=dvi_squad; {|dim4_par| and |num4_par|}
numu_par: cur_parm:=dvi_uquad;
rule_par:
begin cur_v_dimen:=dvi_squad; cur_h_dimen:=dvi_squad;
cur_upd:=(cur_cmd=set_rule);
end;
fnt_par:cur_parm:=cur_cmd-fnt_num_0;
end;
cur_class:=dvi_cl[cur_cmd];
end;
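@ The decomposition of a character code into extension and residue may
be clearer from a Python sketch of the |char_par| case. The function
name |decode_char| is hypothetical, and the operand bytes following the
command are passed as a byte string instead of being read from the file.

```python
SET1, PUT1 = 128, 133  # DVI opcodes


def decode_char(cmd, operand):
    """Return (ext, res, upd) for a set or put character command."""
    if cmd < SET1:  # set_char_0 .. set_char_127: no operand bytes
        return 0, cmd, True
    upd = cmd < PUT1  # True for set commands, False for put commands
    n = cmd - (SET1 if upd else PUT1)  # number of extension bytes
    if not 0 <= n <= 3 or len(operand) != n + 1:
        raise ValueError("not a character command with a complete operand")
    # only a three-byte extension is read as a signed quantity
    ext = int.from_bytes(operand[:n], "big", signed=(n == 3))
    res = operand[n]  # the last byte is the residue
    return ext, res, upd
```

In every case the character code equals |res+256*ext|.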
@ The global variable |dvi_nf| is used for the number of different
\.{DVI} fonts defined so far; their external font numbers (as extracted
from the \.{DVI} file) are stored in the array |dvi_e_fnts|, the
corresponding internal font numbers used by \.{\title} are
stored in the array |dvi_i_fnts|.
@<Glob...@>=
@!dvi_e_fnts:array[font_number] of int_32; {external font numbers}
@!dvi_i_fnts:array[font_number] of font_number; {corresponding
internal font numbers}
@!dvi_nf:font_number; {number of \.{DVI} fonts defined so far}
@ @<Set ini...@>=
dvi_nf:=0;
@ The |dvi_font| procedure sets |cur_fnt| to the internal font number
corresponding to the external font number |cur_parm| (or aborts the
program if such a font was never defined).
@p procedure dvi_font; {computes |cur_fnt| corresponding to |cur_parm|}
var f:font_number; {where the font is sought}
begin @<DVI: Locate font |cur_parm|@>;
if f=dvi_nf then bad_dvi;
cur_fnt:=dvi_i_fnts[f];
end;
@ @<DVI: Locate font |cur_parm|@>=
f:=0; dvi_e_fnts[dvi_nf]:=cur_parm;
while cur_parm<>dvi_e_fnts[f] do incr(f)
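@ The search above relies on a sentinel: the number sought is first
stored in the unused slot |dvi_e_fnts[dvi_nf]|, so the loop needs no
bounds test, and a result of |f=dvi_nf| means that the font was never
defined. The following stand-alone \PASCAL\ sketch demonstrates the idea
in isolation; the names |MaxFonts|, |eFnts|, |nf|, and |Locate| are
hypothetical and merely mirror the variables of the program.

```pascal
program SentinelSearchSketch;
const
  MaxFonts = 10; { hypothetical capacity }
var
  eFnts: array[0..MaxFonts] of Longint; { external numbers; slot nf is free }
  nf: Integer;                          { number of fonts defined so far }

function Locate(parm: Longint): Integer;
var
  f: Integer;
begin
  f := 0;
  eFnts[nf] := parm;        { the sentinel guarantees termination }
  while eFnts[f] <> parm do
    f := f + 1;
  Locate := f;              { f = nf means "not defined" }
end;

begin
  eFnts[0] := 17; eFnts[1] := 42; nf := 2;
  writeln(Locate(42)); { prints 1: found at index 1 }
  writeln(Locate(5));  { prints 2: equal to nf, so not defined }
end.
```

The price of the simpler loop is that the array must always keep one
free slot, which is why the program tests for overflow before |dvi_nf|
is increased.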
@ Finally the |dvi_do_font| procedure is called when one of the commands
|fnt_def1..fnt_def4| and its first parameter have been read from the
\.{DVI} file; the argument indicates whether this should be the second
definition of the font (|true|) or not (|false|).
@p procedure dvi_do_font(@!second:boolean);
var f:font_number; {where the font is sought}
@!k:int_15; {general purpose variable}
begin print('DVI: font ',cur_parm:1);
@<DVI: Locate font |cur_parm|@>;
if (f=dvi_nf)=second then bad_dvi;
font_check(nf):=dvi_squad;
font_scaled(nf):=dvi_pquad;
font_design(nf):=dvi_pquad;
k:=dvi_ubyte; pckt_room(1); append_byte(k);
Incr(k)(dvi_ubyte); pckt_room(k);
while k>0 do begin append_byte(dvi_ubyte); decr(k);
end;
font_name(nf):=make_packet; {the font area plus name}
dvi_i_fnts[dvi_nf]:=make_font;
if not second then
begin if dvi_nf=max_fonts then overflow(str_fonts,max_fonts);
incr(dvi_nf);
end
else if dvi_i_fnts[f]<>dvi_i_fnts[dvi_nf] then bad_dvi;
end;
@* Low-level VF input routines.
The program uses the binary file variable |vf_file| for input from \.{VF}
files; |vf_loc| is the number of the byte about to be read next from
|vf_file|.
@<Glob...@>=
@!vf_file:byte_file; {a \.{VF} file}
@!vf_loc:int_32; {where we are about to look, in |vf_file|}
@!vf_limit:int_32; {value of |vf_loc| at end of a character packet}
@!vf_ext:pckt_pointer; {extension for \.{VF} files}
@!cur_vf:font_number; {font number of current \.{VF} file}
@ @<Initialize predefined strings@>=
id3(".")("V")("F")(vf_ext); {file name extension for \.{VF} files}
@ If a \.{VF} file is badly malformed, we say |bad_vf|; this procedure
gives an error message which refers the user to \.{VFtoVP} and \.{VPtoVF},
and terminates \.{\title}.
@<Error handling...@>=
procedure bad_vf;
begin print_ln(' ');
print('Bad VF file'); print_font(cur_vf); print_ln(': loc=',vf_loc:1,'!');
@.Bad VF file@>
abort('Use VFtoVP/VPtoVF to diagnose and correct the problem');
@.Use VFtoVP/VPtoVF@>
end;
@ To prepare |vf_file| for input we |reset| it.
@<VF: Open |vf_file| or |goto not_found|@>=
reset(vf_file,cur_name);
@^system dependencies@>
if eof(vf_file) then
goto not_found;
vf_loc:=0
@ For some operating systems it may be necessary to close |vf_file|.
@<VF: Close |vf_file|@>=
@ Reading a \.{VF} file should be done as efficiently as possible for a
particular system; on many systems this means that a large number of
bytes from |vf_file| are read into a buffer and then extracted
from that buffer. In order to simplify such system-dependent changes
we use a pair of \.{WEB} macros: |vf_byte| extracts the next \.{VF}
byte and |vf_eof| is |true| if we have reached the end of the \.{VF}
file. Here we give simple-minded definitions for these macros in terms
of standard \PASCAL.
@^system dependencies@>
@^optimization@>
@d vf_eof == eof(vf_file) {has the \.{VF} file been exhausted?}
@d vf_byte(#) ==
if vf_eof then bad_vf
else read(vf_file,#) {obtain next \.{VF} byte}
@ We need several simple functions to read the next byte or bytes
from |vf_file|.
@p function vf_ubyte:int_8u; {returns the next byte, unsigned}
@!begin_byte(vf_byte); incr(vf_loc); comp_ubyte(vf_ubyte);
function vf_upair:int_16u; {returns the next two bytes, unsigned}
@!begin_pair(vf_byte); Incr(vf_loc)(2); comp_upair(vf_upair);
function vf_strio:int_24; {returns the next three bytes, signed}
@!begin_trio(vf_byte); Incr(vf_loc)(3); comp_strio(vf_strio);
function vf_utrio:int_24u; {returns the next three bytes, unsigned}
@!begin_trio(vf_byte); Incr(vf_loc)(3); comp_utrio(vf_utrio);
function vf_squad:int_32; {returns the next four bytes, signed}
@!begin_quad(vf_byte); Incr(vf_loc)(4); comp_squad(vf_squad);
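@ The macros |begin_pair|, |comp_upair|, and their relatives are defined
elsewhere in this program. As a hedged illustration of what such
functions compute, the following stand-alone \PASCAL\ sketch combines
bytes in big-endian order, once as an unsigned pair and once as a signed
trio; the function names |UPair| and |STrio| are hypothetical.

```pascal
program BigEndianSketch;
{ Illustration only: combine bytes, most significant byte first. }

function UPair(a, b: Longint): Longint; { two bytes, unsigned }
begin
  UPair := a * 256 + b;
end;

function STrio(a, b, c: Longint): Longint; { three bytes, signed }
begin
  if a >= 128 then
    a := a - 256;            { the first byte carries the sign }
  STrio := (a * 256 + b) * 256 + c;
end;

begin
  writeln(UPair(1, 2));          { prints 258 }
  writeln(STrio(255, 255, 254)); { prints -2 }
end.
```

Only the first byte needs special treatment for signed values; the
remaining bytes are always taken as unsigned.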
@ All dimensions in a \.{VF} file, except the design sizes of a virtual
font and its local fonts, are |fix_word|s that must be scaled in exactly
the same way as the character widths from a \.{TFM} file; we can use the
same code, but this time |z|, |alpha|, and |beta| are global variables.
@<Glob...@>=
@<Variables for scaling computation@>@;
@ We need five functions to read the next byte or bytes and convert a
|fix_word| to a scaled dimension.
@p function vf_fix1:int_32; {returns the next byte as scaled value}
var x:int_32; {accumulator}
begin vf_byte(tfm_b3); incr(vf_loc);
if tfm_b3>127 then tfm_b1:=255 @+ else tfm_b1:=0;
tfm_b2:=tfm_b1;
x:=@<Scaled value of |tfm_b1..tfm_b3|@>;
if tfm_b1>127 then Decr(x)(alpha);
vf_fix1:=x;
end;
function vf_fix2:int_32; {returns the next two bytes as scaled value}
var x:int_32; {accumulator}
begin vf_byte(tfm_b2); vf_byte(tfm_b3); Incr(vf_loc)(2);
if tfm_b2>127 then tfm_b1:=255 @+ else tfm_b1:=0;
x:=@<Scaled value of |tfm_b1..tfm_b3|@>;
if tfm_b1>127 then Decr(x)(alpha);
vf_fix2:=x;
end;
function vf_fix3:int_32; {returns the next three bytes as scaled value}
var x:int_32; {accumulator}
begin vf_byte(tfm_b1); vf_byte(tfm_b2); vf_byte(tfm_b3);
Incr(vf_loc)(3);@/
x:=@<Scaled value of |tfm_b1..tfm_b3|@>;
if tfm_b1>127 then Decr(x)(alpha);
vf_fix3:=x;
end;
function vf_fix3u:int_32; {returns the next three bytes as unsigned scaled value}
begin vf_byte(tfm_b1); vf_byte(tfm_b2); vf_byte(tfm_b3);
Incr(vf_loc)(3);@/
vf_fix3u:=@<Scaled value of |tfm_b1..tfm_b3|@>;
end;
function vf_fix4:int_32; {returns the next four bytes as scaled value}
var x:int_32; {accumulator}
begin vf_byte(tfm_b0); vf_byte(tfm_b1); vf_byte(tfm_b2); vf_byte(tfm_b3);
Incr(vf_loc)(4);@/
x:=@<Scaled value of |tfm_b1..tfm_b3|@>;
if tfm_b0>0 then
if tfm_b0<255 then bad_vf @+ else Decr(x)(alpha);
vf_fix4:=x;
end;
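@ The modules that prepare |z|, |alpha|, and |beta| and that compute the
scaled value are defined elsewhere in this program. The following
stand-alone \PASCAL\ sketch assumes that they follow the method used by
\TeX\ for \.{TFM} widths; that assumption, and the names
|PrepareScaling| and |ScaleFix|, are not taken from this section. The
four arguments of |ScaleFix| play the role of |tfm_b0..tfm_b3|.

```pascal
program FixWordSketch;
{ Illustration only: scale a four-byte fix_word by a font size z,
  assuming the method TeX uses for TFM widths. }
var
  z, alpha, beta: Longint;

procedure PrepareScaling(scaled: Longint);
begin
  z := scaled; alpha := 16;
  while z >= 8388608 do      { keep z below 2^23 to avoid overflow }
  begin
    z := z div 2; alpha := alpha + alpha;
  end;
  beta := 256 div alpha; alpha := alpha * z;
end;

function ScaleFix(b0, b1, b2, b3: Longint): Longint;
var
  x: Longint; { accumulator }
begin
  x := (((b3 * z) div 256 + b2 * z) div 256 + b1 * z) div beta;
  if b0 = 255 then
    x := x - alpha;          { negative fix_word }
  ScaleFix := x;
end;

begin
  PrepareScaling(655360);            { 10pt expressed in scaled points }
  writeln(ScaleFix(0, 16, 0, 0));    { fix_word 1.0, prints 655360 }
  writeln(ScaleFix(255, 240, 0, 0)); { fix_word -1.0, prints -655360 }
end.
```

The shorter variants |vf_fix1|, |vf_fix2|, and |vf_fix3| differ only in
how the missing leading bytes are filled in by sign extension before the
same computation is applied.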
@ Three other functions are used in cases where the result must have a
non-negative value or a positive value.
@p function vf_uquad:int_31; {result must be non-negative}
var x:int_32;
begin x:=vf_squad; if x<0 then bad_vf @+ else vf_uquad:=x;
end;
function vf_pquad:int_31; {result must be positive}
var x:int_32;
begin x:=vf_squad; if x<=0 then bad_vf @+ else vf_pquad:=x;
end;
function vf_fixp:int_31; {result must be positive}
var x:int_32; {accumulator}
begin vf_byte(tfm_b0); vf_byte(tfm_b1); vf_byte(tfm_b2); vf_byte(tfm_b3);
Incr(vf_loc)(4);@/
x:=@<Scaled value of |tfm_b1..tfm_b3|@>;
if tfm_b0>0 then bad_vf;
vf_fixp:=x;
end;
@ The |vf_first_par| procedure first reads a \.{VF} command byte into
|cur_cmd|; then |cur_parm| is set to the value of the first parameter
(if any) and |cur_class| to the command class.
@d set_cur_wp == {set |cur_wp| to the char's width pointer}
cur_wp:=invalid_width;
if cur_fnt<>invalid_font then
if (cur_res>=font_bc(cur_fnt))and(cur_res<=font_ec(cur_fnt)) then
cur_wp:=font_width(cur_fnt)(cur_res)
@p procedure vf_first_par;
begin cur_cmd:=vf_ubyte;
case dvi_par[cur_cmd] of
char_par:
begin if cur_cmd<set1 then
begin cur_ext:=0; cur_res:=cur_cmd; cur_upd:=true
end
else begin cur_upd:=(cur_cmd<put1);
case cur_cmd-dvi_char_cmd[cur_upd] of
0: cur_ext:=0;
1: cur_ext:=vf_ubyte;
2: cur_ext:=vf_upair;
3: cur_ext:=vf_strio;
end;
cur_res:=vf_ubyte;
end;
set_cur_wp; if cur_wp=invalid_width then bad_vf;
end;
no_par: do_nothing;
dim1_par: cur_parm:=vf_fix1;
num1_par: cur_parm:=vf_ubyte;
dim2_par: cur_parm:=vf_fix2;
num2_par: cur_parm:=vf_upair;
dim3_par: cur_parm:=vf_fix3;
num3_par: cur_parm:=vf_utrio;
dim4_par: cur_parm:=vf_fix4;
num4_par: cur_parm:=vf_squad;
numu_par: cur_parm:=vf_uquad;
rule_par:
begin cur_v_dimen:=vf_fix4; cur_h_dimen:=vf_fix4;
cur_upd:=(cur_cmd=set_rule);
end;
fnt_par:cur_parm:=cur_cmd-fnt_num_0;
end; {there are no other cases}
cur_class:=dvi_cl[cur_cmd];
end;
@ Here we define the additional |font_data| fields required for virtual
fonts; |font_vf_fnt(f)| is the default font for character packets from
virtual font~|f|, |font_vf_packet(f)(c)| is the character packet for
character~|c| from virtual font~|f|.
In order to simplify the \.{web2c} translation the fields in the variant
for |f_type=vf_font_type| are accessed through the \.{WEB} macro
|vf_font_data|.
@^font types@>@.web2c@>
@d vf_font_data(#)==font_data[#] {access |vf_font_type| variant fields}
@d font_vf_chars(#)==vf_font_data(#).vf_chars_field {character packet offset}
@d font_vf_fnt(#)==vf_font_data(#).vf_fnt_field {font number of default font}
@d font_vf_packet_end(#)==#]
@d font_vf_packet(#)==char_packets[font_vf_chars(#)+font_vf_packet_end
@<Cases for |font_record|@>=
vf_font_type:
(@!vf_chars_field:char_offset; {character packet offset}
@!vf_fnt_field:font_number); {font number of default font}
@ The global variable |vf_nf| is used for the number of different local
fonts defined in a \.{VF} file so far; their external font numbers (as
extracted from the \.{VF} file) are stored in the array |vf_e_fnts|, and
the corresponding font numbers used internally by \.{\title} are
stored in the array |vf_i_fnts|.
@<Glob...@>=
@!vf_e_fnts:array[font_number] of int_32; {external font numbers}
@!vf_i_fnts:array[font_number] of font_number; {corresponding
internal font numbers}
@!vf_nf:font_number; {number of local fonts defined so far}
@!lcl_nf:font_number; {largest |vf_nf| value for any \.{VF} file}
@ @<Set init...@>=
lcl_nf:=0;
@ The |vf_font| procedure sets |cur_fnt| to the internal font number
corresponding to the external font number |cur_parm| (or aborts the
program if such a font was never defined).
@p procedure vf_font; {computes |cur_fnt| corresponding to |cur_parm|}
var f:font_number; {where the font is sought}
begin @<VF: Locate font |cur_parm|@>;
if f=vf_nf then bad_vf;
cur_fnt:=vf_i_fnts[f];
end;
@ @<VF: Locate font |cur_parm|@>=
f:=0; vf_e_fnts[vf_nf]:=cur_parm;
while cur_parm<>vf_e_fnts[f] do incr(f)
@ Finally the |vf_do_font| procedure is called when one of the commands
|fnt_def1..fnt_def4| and its first parameter have been read from the
\.{VF} file.
@p procedure vf_do_font;
var f:font_number; {where the font is sought}
@!k:int_15; {general purpose variable}
begin print('VF: font ',cur_parm:1);@/
@<VF: Locate font |cur_parm|@>;
if f<>vf_nf then bad_vf;
font_check(nf):=vf_squad;
font_scaled(nf):=vf_fixp;
font_design(nf):=round(tfm_conv*vf_pquad);
k:=vf_ubyte; pckt_room(1); append_byte(k);
Incr(k)(vf_ubyte); pckt_room(k);
while k>0 do begin append_byte(vf_ubyte); decr(k);
end;
font_name(nf):=make_packet; {the font area plus name}
vf_i_fnts[vf_nf]:=make_font;
if vf_nf=lcl_nf then
if lcl_nf=max_fonts then overflow(str_fonts,max_fonts)
else incr(lcl_nf);
incr(vf_nf);
end;
@* Reading VF files.
First we need a few global variables: |cur_vf_ext| and |cur_vf_res|
are the extension and residue of the character we are building a packet
for; |vf_fnt| is the current font of the packet being built (whereas
|cur_fnt| is the current font of the packet read from the \.{VF} file).
@<Glob...@>=
@!cur_vf_ext:int_24; {character extension for the current packet}
@!cur_vf_res:int_8u; {character residue for the current packet}
@!cur_vf_wp:width_pointer; {width pointer for the current packet}
@!vf_fnt:font_number; {current font in the current packet}
@ The \.{VF} format specifies that the interpretation of each packet
begins with |w=x=y=z=0|; any |w0|, |x0|, |y0|, or |z0| command using
these initial values will be ignored.
@<Types...@>=
@!vf_state=array[0..1,0..1] of boolean; {state of |w|, |x|, |y|, and |z|}
@ As implied by the \.{VF} format the \.{DVI} commands read from the
\.{VF} file are enclosed by |push| and |pop|; as we read \.{DVI}
commands and append them to |byte_mem|, we perform a set of
transformations in order to simplify the resulting packet: Let |zero| be
any of the commands |put|, |put_rule|, |fnt_num|, |fnt|, or |xxx| which
all leave the current position on the page unchanged, let |move| be any
of the horizontal or vertical movement commands |right1..z4|, and let
|any| be any sequence of commands containing |push| and |pop| in
properly nested pairs; whenever possible we apply one of the following
transformation rules: $$\def\n#1:{\hbox to 3cm{\hfil#1:}}
\leqalignno{
\hbox{|push| |zero|}&\RA\hbox{|zero| |push|}&\n1:\cr
\hbox{|move| |pop|}&\RA\hbox{|pop|}&\n2:\cr
\hbox{|push| |pop|}&\RA{}&\n3:\cr
\hbox{|push| |set_char| |pop|}&\RA\hbox{|put|}&\n4a:\cr
\hbox{|push| \\{set} |pop|}&\RA\hbox{|put|}&\n4b:\cr
\hbox{|push| |set_rule| |pop|}&\RA\hbox{|put_rule|}&\n4c:\cr
\hbox{|push| |push| |any| |pop|}&\RA\hbox{|push| |any| |pop| |push|}&\n5:\cr
\hbox{|push| |any| |pop| |pop|}&\RA\hbox{|any| |pop|}&\n6:\cr
}$$
@ In order to perform these transformations we need a stack which is
indexed by |vf_ptr|, the number of |push| commands without corresponding
|pop| in the packet we are building; the |vf_push_loc| array contains
the locations in |byte_mem| following such |push| commands.
In view of rule~5 consecutive |push| commands are never stored; the
|vf_push_num| array is used to count them instead.
The |vf_last| array indicates the type of the last non-discardable item:
a character, a rule, or a group enclosed by |push| and |pop|;
the |vf_last_end| array points to the ending locations and, if
|vf_last<>vf_other|, the |vf_last_loc| array points to the starting
locations of these items.
@d vf_set=0 {|vf_set=char_cl|, last item is a |set_char| or \\{set}}
@d vf_rule=1 {|vf_rule=rule_cl|, last item is a |set_rule|}
@d vf_group=2 {last item is a group enclosed by |push| and |pop|}
@d vf_put=3 {last item is a |put|}
@d vf_other=4 {last item (if any) is none of the above}
@<Types...@>=
@!vf_type=vf_set..vf_other;
@ @<Glob...@>=
@!vf_move: array[stack_pointer] of vf_state; {state of |w|, |x|, |y|, and |z|}
@!vf_push_loc: array[stack_pointer] of byte_pointer; {end of a |push|}
@!vf_last_loc: array[stack_pointer] of byte_pointer; {start of an item}
@!vf_last_end: array[stack_pointer] of byte_pointer; {end of an item}
@!vf_push_num: array[stack_pointer] of eight_bits; {|push| count}
@!vf_last: array[stack_pointer] of vf_type; {type of last item}
@!vf_ptr:stack_pointer; {current number of unfinished groups}
@!stack_used:stack_pointer; {largest |vf_ptr| or |stack_ptr| value}
@ We use two small arrays to determine the item type of a character or a
rule.
@<Glob...@>=
@!vf_char_type:array[boolean] of vf_type;
@!vf_rule_type:array[boolean] of vf_type;
@ @<Set init...@>=
vf_move[0][0][0]:=false; vf_move[0][0][1]:=false;
vf_move[0][1][0]:=false; vf_move[0][1][1]:=false;@/
stack_used:=0;@/
vf_char_type[false]:=vf_put; vf_char_type[true]:=vf_set;@/
vf_rule_type[false]:=vf_other; vf_rule_type[true]:=vf_rule;
@ The |vf_do_char| procedure is used to read, analyze, and store a
character packet.
@p procedure vf_do_char; {read and store a \.{VF} packet}
label reswitch,done;
var temp_byte:int_8u; {byte for temporary variables}
@!temp_int:int_32; {integer for temporary variables}
@!k:byte_pointer; {index into |byte_mem|}
@!move_zero:boolean; {|true| if rule 1 is used}
@!last_pop:boolean; {|true| if final |pop| has been manufactured}
begin @<VF: Initialize the character packet@>;@/
@<VF: Append \.{DVI} commands to the character packet@>;@/
@<VF: Build the final form of the character packet@>;@/
end;
@ Here we read the first bytes of a character packet from the \.{VF}
file and initialize the packet being built in |byte_mem|; the start of
the whole packet is stored in |vf_push_loc[0]|.
@<VF: Initialize the character packet@>=
if cur_cmd<long_char then
begin cur_parm:=cur_cmd;
cur_vf_ext:=0; cur_vf_res:=vf_ubyte; temp_int:=vf_fix3u;
end
else begin cur_parm:=vf_uquad;
cur_vf_ext:=vf_strio; cur_vf_res:=vf_ubyte; temp_int:=vf_fix4;
end;
if (cur_vf_res>=font_bc(cur_vf))and(cur_vf_res<=font_ec(cur_vf)) then
cur_vf_wp:=font_width(cur_vf)(cur_vf_res)
else cur_vf_wp:=invalid_width;
if cur_vf_wp=invalid_width then bad_vf;
check_width(cur_vf_wp,temp_int);
vf_limit:=vf_loc+cur_parm;
cur_fnt:=font_vf_fnt(cur_vf); vf_fnt:=cur_fnt;@/
start_packet(cur_vf_ext,font_vf_packet(cur_vf)(cur_vf_res),0);@/
vf_push_loc[0]:=byte_ptr; vf_last_end[0]:=byte_ptr;
vf_last[0]:=vf_other; vf_ptr:=0
@ For every \.{DVI} command read from the \.{VF} file some action is
performed; in addition the initial |push| and the final |pop| are
manufactured here.
@<VF: Append \.{DVI} commands to the character packet@>=
last_pop:=false;
cur_class:=push_cl; {initial |push|}
loop begin
reswitch:case cur_class of
three_cases(char_cl): @<VF: Do a |char|, |rule|, or |xxx|@>;
push_cl: @<VF: Do a |push|@>;
pop_cl: @<VF: Do a |pop|@>;
two_cases(w0_cl):
if vf_move[vf_ptr][0][cur_class-w0_cl] then append_one(cur_cmd);
three_cases(right_cl):
begin pckt_signed(dvi_right_cmd[cur_class],cur_parm);
if cur_class>=w_cl then vf_move[vf_ptr][0][cur_class-w_cl]:=true;
end;
two_cases(y0_cl):
if vf_move[vf_ptr][1][cur_class-y0_cl] then append_one(cur_cmd);
three_cases(down_cl):
begin pckt_signed(dvi_down_cmd[cur_class],cur_parm);
if cur_class>=y_cl then vf_move[vf_ptr][1][cur_class-y_cl]:=true;
end;
fnt_cl: vf_font;
fnt_def_cl: bad_vf;
invalid_cl: if cur_cmd<>nop then bad_vf;
end; {there are no other cases}
if vf_loc<vf_limit then vf_first_par
else if last_pop then goto done
else begin cur_class:=pop_cl; last_pop:=true; {final |pop|}
end;
end;
done:if (vf_ptr<>0)or(vf_loc<>vf_limit) then bad_vf
@ For a |push| we either increase |vf_push_num| or start a new level and
append a |push|.
@d incr_stack(#)==
if #=stack_used then
if stack_used=stack_size then overflow(str_stack,stack_size)
else incr(stack_used);
incr(#)
@<VF: Do a |push|@>=
if (vf_ptr>0)and(vf_push_loc[vf_ptr]=byte_ptr) then
begin if vf_push_num[vf_ptr]=255 then overflow(str_stack,255);
incr(vf_push_num[vf_ptr]);
end
else begin incr_stack(vf_ptr);
@<VF: Start a new level@>;
vf_push_num[vf_ptr]:=0;
end
@ @<VF: Start a new level@>=
append_one(push);
vf_move[vf_ptr]:=vf_move[vf_ptr-1];
vf_push_loc[vf_ptr]:=byte_ptr;
vf_last_end[vf_ptr]:=byte_ptr;
vf_last[vf_ptr]:=vf_other
@ When a character, a rule, or an |xxx| is appended, transformation
rule~1 might be applicable.
@<VF: Do a |char|, |rule|, or |xxx|@>=
begin if (vf_ptr=0)or(byte_ptr>vf_push_loc[vf_ptr]) then move_zero:=false
else case cur_class of
char_cl: move_zero:=(not cur_upd)or(cur_fnt<>vf_fnt);
rule_cl: move_zero:=not cur_upd;
xxx_cl: move_zero:=true;
end; {there are no other cases}
if move_zero then begin decr(byte_ptr); decr(vf_ptr);
end;
case cur_class of
char_cl: @<VF: Do a |fnt|, a |char|, or both@>;
rule_cl: @<VF: Do a |rule|@>;
xxx_cl: @<VF: Do an |xxx|@>;
end; {there are no other cases}
vf_last_end[vf_ptr]:=byte_ptr;
if move_zero then
begin incr(vf_ptr); append_one(push); vf_push_loc[vf_ptr]:=byte_ptr;
vf_last_end[vf_ptr]:=byte_ptr;
if cur_class=char_cl then if cur_upd then goto reswitch;
end;
end
@ A special situation arises if transformation rule~1 is applied to a
|fnt_num| or |fnt| command, but not to the |set_char| or \\{set} command
following it; in this case |cur_upd| and |move_zero| are both |true| and
the |set_char| or \\{set} command will be appended later.
@<VF: Do a |fnt|, a |char|, or both@>=
begin if cur_fnt<>vf_fnt then
begin vf_last[vf_ptr]:=vf_other;
pckt_unsigned(fnt1,cur_fnt); vf_fnt:=cur_fnt;
end;
if (not move_zero)or(not cur_upd) then
begin vf_last[vf_ptr]:=vf_char_type[cur_upd];
vf_last_loc[vf_ptr]:=byte_ptr;
pckt_char(cur_upd,cur_ext,cur_res);
end;
end
@ @<VF: Do a |rule|@>=
begin vf_last[vf_ptr]:=vf_rule_type[cur_upd];
vf_last_loc[vf_ptr]:=byte_ptr;
append_one(dvi_rule_cmd[cur_upd]);
pckt_four(cur_v_dimen); pckt_four(cur_h_dimen);
end
@ @<VF: Do an |xxx|@>=
begin vf_last[vf_ptr]:=vf_other;
pckt_unsigned(xxx1,cur_parm); pckt_room(cur_parm);
while cur_parm>0 do
begin append_byte(vf_ubyte); decr(cur_parm);
end;
end
@ Transformation rules 2--6 are triggered by a |pop|, either read from
the \.{VF} file or manufactured at the end of the packet.
@<VF: Do a |pop|@>=
begin if vf_ptr<1 then bad_vf;
byte_ptr:=vf_last_end[vf_ptr]; {this is rule 2}
if vf_last[vf_ptr]<=vf_rule then
if vf_last_loc[vf_ptr]=vf_push_loc[vf_ptr] then
@<VF: Prepare for rule 4@>;
if byte_ptr=vf_push_loc[vf_ptr] then @<VF: Apply rule 3 or 4@>
else begin if vf_last[vf_ptr]=vf_group then @<VF: Apply rule 6@>;
append_one(pop); decr(vf_ptr); vf_last[vf_ptr]:=vf_group;
vf_last_loc[vf_ptr]:=vf_push_loc[vf_ptr+1]-1;
vf_last_end[vf_ptr]:=byte_ptr;
if vf_push_num[vf_ptr+1]>0 then @<VF: Apply rule 5@>;
end;
end
@ In order to implement transformation rule~4, we cancel the |set_char|,
\\{set}, or |set_rule|, append a |pop|, and insert a |put| or |put_rule|
with the old parameters.
@<VF: Prepare for rule 4@>=
begin cur_class:=vf_last[vf_ptr]; cur_upd:=false;
byte_ptr:=vf_push_loc[vf_ptr];
end
@ @<VF: Apply rule 3 or 4@>=
begin if vf_push_num[vf_ptr]>0 then
begin decr(vf_push_num[vf_ptr]);
vf_move[vf_ptr]:=vf_move[vf_ptr-1];
end
else begin decr(byte_ptr); decr(vf_ptr);
end;
if cur_class<>pop_cl then goto reswitch; {this is rule 4}
end
@ @<VF: Apply rule 6@>=
begin Decr(byte_ptr)(2);
for k:=vf_last_loc[vf_ptr]+1 to byte_ptr do byte_mem[k-1]:=byte_mem[k];
vf_last[vf_ptr]:=vf_other; vf_last_end[vf_ptr]:=byte_ptr;
end
@ @<VF: Apply rule 5@>=
begin incr(vf_ptr);
@<VF: Start a new level@>;
decr(vf_push_num[vf_ptr]);
end
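@ The effect of transformation rules 2 and~3 can be illustrated by a
much simplified stand-alone \PASCAL\ sketch. It is not the method used
by \.{\title}, which works directly on |byte_mem| with the help of the
|vf_last_end| and |vf_push_loc| arrays; here a packet is modelled as a
string over a hypothetical alphabet in which `\.[' stands for |push|,
`\.]' for |pop|, `\.m' for a movement command, and `\.c' for a
character. The function name |Simplify| is likewise hypothetical, and
rules 1, 4, 5, and~6 are deliberately left out.

```pascal
program PeepholeSketch;
{ Illustration only. Token alphabet (hypothetical):
  '[' = push, ']' = pop, 'm' = movement, 'c' = set a character. }

function Simplify(const s: string): string;
var
  outp: string; { the simplified packet built so far }
  i: Integer;
begin
  outp := '';
  for i := 1 to Length(s) do
    if s[i] = ']' then
    begin
      { rule 2: movements directly before a pop have no effect }
      while (Length(outp) > 0) and (outp[Length(outp)] = 'm') do
        Delete(outp, Length(outp), 1);
      { rule 3: an empty push/pop pair disappears }
      if (Length(outp) > 0) and (outp[Length(outp)] = '[') then
        Delete(outp, Length(outp), 1)
      else
        outp := outp + ']';
    end
    else
      outp := outp + s[i];
  Simplify := outp;
end;

begin
  writeln(Simplify('[cm[mm]m]')); { prints [c] }
end.
```

In the example the inner group contains only movements, so rule~2
empties it and rule~3 removes it; the movements that then precede the
final |pop| are discarded by rule~2 as well.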
@ Finally a type is assigned to the packet just built: |vf_simple| if
the packet ends with a character of the correct width, or |vf_complex|
in all other cases; if a |vf_simple| packet for a character with
extension zero consists of just one character with extension zero and
the same residue, and if there is no previous packet, the whole packet
is replaced by the empty packet.
@d vf_simple=0 {the packet ends with a character of the correct width}
@d vf_complex=1 {otherwise}
@<VF: Build the final form of the character packet@>=
temp_byte:=vf_complex; {just in case}
if vf_last[0]=vf_put then if cur_wp=cur_vf_wp then temp_byte:=vf_simple;
k:=pckt_start[pckt_ptr]; Incr(byte_mem[k])(temp_byte);
if (bo(byte_mem[k])=0)and@|(vf_push_loc[0]=vf_last_loc[0])and@|
(cur_ext=0)and@|(cur_res=cur_vf_res) then byte_ptr:=k;
font_vf_packet(cur_vf)(cur_vf_res):=make_packet
@ The \.{VF} format specifies that after a character packet invoked by a
|set_char| or \\{set} command, ``|h|~is increased by the \.{TFM} width
(properly scaled)---just as if a simple character had been typeset'';
for |vf_simple| packets this is achieved by changing the final |put|
command into |set_char| or \\{set}, but for |vf_complex| packets an
explicit movement must be done. This poses a problem for programs,
such as \.{DVIcopy}, which write a new \.{DVI} file with all references
to characters from virtual fonts replaced by their character packets:
The \.{DVItype} program specifies that the horizontal movements after a
|set_char| or \\{set} command, after a |set_rule| command, and after one
of the commands |right1..x4|, are all treated differently when \.{DVI}
units are converted to pixels.
Thus we introduce a slight extension of \.{DVItype}'s pixel rounding
algorithm and hope that this extension will become part of the standard
\.{DVItype} program in the near future: If a \.{DVI} file contains a
|set_rule| command for a rule with the negative height |width_dimen|,
then this rule shall be treated in exactly the same way as a fictitious
character whose width is the width of that rule; as value of |width_dimen|
we choose $-2^{31}$, the smallest signed 32-bit integer.
@<Glob...@>=
@!width_dimen:int_32; {vertical dimension of special rules}
@ When initializing |width_dimen| we are careful to avoid arithmetic
overflow.
@<Set init...@>=
width_dimen:=-@"40000000; Decr(width_dimen)(@"40000000);
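@ The two-step initialization is needed because $2^{31}$ itself cannot
be represented as a positive signed 32-bit integer, so the value
$-2^{31}$ cannot be obtained by negating it. A stand-alone \PASCAL\
sketch of the same idea, with the hypothetical variable name
|widthDimen| and assuming a 32-bit |Longint| type:

```pascal
program WidthDimenSketch;
{ Illustration only: reach -2^31 without ever forming +2^31. }
var
  widthDimen: Longint; { assumed to be a signed 32-bit integer }
begin
  widthDimen := -1073741824;             { -2^30 }
  widthDimen := widthDimen - 1073741824; { now -2^31 }
  writeln(widthDimen);                   { prints -2147483648 }
end.
```

Every intermediate value stays within the signed 32-bit range, so no
overflow can occur.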
@ If no font directory has been specified, \.{\title} is supposed to use
the default \.{VF} directory, which is a system-dependent place where
the \.{VF} files for standard fonts are kept.
The string variable |VF_default_area| contains the name of this area.
@^system dependencies@>
@d VF_default_area_name=='TeXvfonts:' {change this to the correct name}
@d VF_default_area_name_length=10 {change this to the correct length}
@<Glob...@>=
@!VF_default_area:packed array[1..VF_default_area_name_length] of char;
@ @<Set init...@>=
VF_default_area:=VF_default_area_name;
@ The function |do_vf| attempts to read the \.{VF} file for a font and
returns |false| if the \.{VF} file could not be found; when the \.{VF}
file has been read, the font type is changed to |vf_font_type|.
@p function do_vf:boolean; {read a \.{VF} file}
label not_found,exit;
var temp_byte:int_8u; {byte for temporary variables}
@!k:int_15; {general purpose variable}
@!save_ext:int_24; {used to save |cur_ext|}
@!save_res:int_8u; {used to save |cur_res|}
@!save_wp:width_pointer; {used to save |cur_wp|}
@!save_upd:boolean; {used to save |cur_upd|}
begin save_ext:=cur_ext; save_res:=cur_res; save_wp:=cur_wp;
save_upd:=cur_upd; cur_vf:=cur_fnt; {save}
for k:=1 to VF_default_area_name_length do
cur_name[k]:=VF_default_area[k];
make_name(font_name(cur_vf),vf_ext,VF_default_area_name_length);@/
@<VF: Open |vf_file| or |goto not_found|@>;
font_type(cur_vf):=vf_font_type;@/
@<VF: Process the preamble@>;@/
@<VF: Process the font definitions@>;@/
@<VF: Process the character packets@>;@/
@!debug print('VF file for font ',cur_vf:1); print_font(cur_vf);
print_ln(' loaded.');
gubed @;@/
@<VF: Close |vf_file|@>;@/
cur_ext:=save_ext; cur_res:=save_res; cur_wp:=save_wp;
cur_upd:=save_upd; cur_fnt:=cur_vf; {restore}
do_vf:=true; return;
not_found:do_vf:=false;
exit:end;
@ @<VF: Process the preamble@>=
if vf_ubyte<>pre then bad_vf;
if vf_ubyte<>vf_id then bad_vf;
temp_byte:=vf_ubyte; pckt_room(temp_byte);
for k:=1 to temp_byte do append_byte(vf_ubyte);
print('VF file: '''); print_packet(new_packet); print_ln(''',');
flush_packet;
font_check(nf):=vf_squad;
check_check_sum(nf,font_check(cur_vf));
font_design(nf):=round(tfm_conv*vf_pquad);
check_design_size(nf,font_design(cur_vf));
print(' for font ',cur_vf:1); print_font(cur_vf); print_ln('.')
@ @<VF: Process the font definitions@>=
z:=font_scaled(cur_vf);
@<Replace |z|...@>;@/
vf_i_fnts[0]:=invalid_font; vf_nf:=0;@/
cur_cmd:=vf_ubyte;
while (cur_cmd>=fnt_def1)and(cur_cmd<=fnt_def1+3) do
begin case cur_cmd-fnt_def1 of
0: cur_parm:=vf_ubyte;
1: cur_parm:=vf_upair;
2: cur_parm:=vf_utrio;
3: cur_parm:=vf_squad;
end; {there are no other cases}
vf_do_font;
cur_cmd:=vf_ubyte;
end;
font_vf_fnt(cur_vf):=vf_i_fnts[0]
@ @<VF: Process the character packets@>=
font_vf_chars(cur_vf):=make_char_packets(cur_vf); {allocate packets}
while cur_cmd<=long_char do
begin vf_do_char;
cur_cmd:=vf_ubyte;
end;
if cur_cmd<>post then bad_vf
@* Low-level output routines.
The program uses the binary file variable |out_file| for its main output
file; |out_loc| is the number of the byte about to be written next on
|out_file|.
@<Glob...@>=
@!out_file:byte_file; {the \.{DVI} file we are writing}
@!out_loc:int_32; {where we are about to write, in |out_file|}
@!out_back:int_32; {a back pointer}
@!out_max_v:int_31; {maximum |v| value so far}
@!out_max_h:int_31; {maximum |h| value so far}
@!out_stack:int_16u; {maximum stack depth}
@!out_pages:int_16u; {total number of pages}
@ @<Set ini...@>=
out_loc:=0; out_back:=-1;
out_max_v:=0; out_max_h:=0;
out_stack:=0; out_pages:=0;
@ To prepare |out_file| for output, we |rewrite| it.
@<Open output file(s)@>=
rewrite(out_file); {prepares to write packed bytes to |out_file|}
@ For some operating systems it may be necessary to close |out_file|.
@<Close output file(s)@>=
@ Writing the |out_file| should be done as efficiently as possible for a
particular system; on many systems this means that a large number of
bytes are accumulated in a buffer and then written from that
buffer to |out_file|. In order to simplify such system-dependent changes
we use the \.{WEB} macro |out_byte| to write the next \.{DVI} byte. Here
we give a simple-minded definition for this macro in terms of standard
\PASCAL.
@^system dependencies@>
@^optimization@>
@d out_byte(#) == write(out_file,#) {write next \.{DVI} byte}
@ The \.{WEB} macro |out_one| is used to write one byte and to update
|out_loc|.
@d out_one(#) == begin out_byte(#); incr(out_loc); @+ end
@ First the |out_packet| procedure copies a packet to |out_file|.
@p procedure out_packet(@!p:pckt_pointer);
var k:byte_pointer; {index into |byte_mem|}
begin Incr(out_loc)(pckt_length(p));
for k:=pckt_start[p] to pckt_start[p+1]-1 do out_byte(bo(byte_mem[k]));
end;
@ Next are the procedures used to write integer numbers or even complete
\.{DVI} commands to |out_file|; they all keep |out_loc| up to date.
The |out_four| procedure outputs four bytes in two's complement notation,
without risking arithmetic overflow.
@p procedure out_four(@!x:int_32); {output four bytes}
@!begin_four; comp_four(out_byte); Incr(out_loc)(4);
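@ The macros |begin_four| and |comp_four| are defined elsewhere in this
program. As a hedged illustration of how four bytes in two's complement
notation can be produced without arithmetic overflow, the following
stand-alone \PASCAL\ sketch adds $2^{31}$ to a negative value in two
steps and compensates by adding 128 to the leading byte. The procedure
name |FourBytes| is hypothetical and the bytes are printed rather than
written to a file.

```pascal
program OutFourSketch;
{ Illustration only: split a signed 32-bit value into four bytes,
  most significant byte first, using non-negative arithmetic only. }

procedure FourBytes(x: Longint);
var
  b0: Longint; { the leading byte }
begin
  if x >= 0 then
    b0 := x div 16777216
  else
  begin
    x := x + 1073741824;
    x := x + 1073741824;          { 2^31 added in two steps }
    b0 := x div 16777216 + 128;   { restore the sign bit }
  end;
  x := x mod 16777216;
  writeln(b0, ' ', x div 65536, ' ', (x mod 65536) div 256, ' ', x mod 256);
end;

begin
  FourBytes(-1);  { prints 255 255 255 255 }
  FourBytes(258); { prints 0 0 1 2 }
end.
```

After the adjustment |x| is never negative, so |div| and |mod| behave
identically on every \PASCAL\ system.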
@ The |out_char| procedure outputs a |set_char| or \\{set} command or, if
|upd=false|, a |put| command.
@p procedure out_char(@!upd:boolean;@!ext:int_32;@!res:eight_bits);
{output \\{set} or |put|}
@!begin_char; comp_char(out_one);
@ The |out_unsigned| procedure outputs a |fnt|, |xxx|, or |fnt_def|
command with its first parameter (normally unsigned); a |fnt| command
is converted into |fnt_num| whenever this is possible.
@p procedure out_unsigned(@!o:eight_bits;@!x:int_32);
{output |fnt_num|, |fnt|, |xxx|, or |fnt_def|}
@!begin_unsigned; comp_unsigned(out_one);
@ The |out_signed| procedure outputs a movement (|right|, |w|,
|x|, |down|, |y|, or |z|) command with its (signed) parameter.
@p procedure out_signed(@!o:eight_bits;@!x:int_32);
{output |right|, |w|, |x|, |down|, |y|, or |z|}
@!begin_signed; comp_signed(out_one);
@ Here we define the additional |font_data| fields required for the real
fonts used in |out_file|.
In order to simplify the \.{web2c} translation the fields in the variant
for |f_type=out_font_type| are accessed through the \.{WEB} macro
|out_font_data|.
@^font types@>@.web2c@>
@d out_font_data(#)==font_data[#] {access |out_font_type| variant fields}
@d font_out(#)==out_font_data(#).out_field {font number in |out_file|}
@<Cases for |font_record|@>=
out_font_type:
(@!out_field:font_number); {font number in |out_file|}
@ The global variable |out_nf| is the number of fonts already used in
|out_file| and the array |out_fnts| contains their internal font numbers;
the current font in |out_file| is called |out_fnt|.
@<Glob...@>=
@!out_fnts:array[font_number] of font_number; {internal font numbers}
@!out_nf:font_number; {number of fonts used in |out_file|}
@!out_fnt:font_number; {internal font number of current output font}
@ @<Set init...@>=
out_nf:=0;
@ @<Print more font usage statistics@>=
print(out_nf:1,' out, ');
@ The |out_fnt_def| procedure outputs a complete font definition
command.
@p procedure out_fnt_def(@!f:font_number);
var p:pckt_pointer; {the font name packet}
@!k,@!l:byte_pointer; {indices into |byte_mem|}
@!a:eight_bits; {length of area part}
begin out_unsigned(fnt_def1,font_out(f)); out_four(font_check(f));
out_four(font_scaled(f)); out_four(font_design(f));@/
p:=font_name(f); k:=pckt_start[p]; l:=pckt_start[p+1]-1;
a:=bo(byte_mem[k]);@/
Incr(out_loc)(l-k+2); out_byte(a); out_byte(l-k-a);
while k<l do
begin incr(k); out_byte(bo(byte_mem[k]));
end;
end;
@* Writing the output file.
Here we define the device dependent parts of the typesetting routines
described later in this program.
The device dependent code for a real output device must define a few constants;
here we demonstrate how they should be defined.
@d h_resolution=300 {horizontal resolution in pixels per inch (dpi)}
@d v_resolution=300 {vertical resolution in pixels per inch (dpi)}
@ These are the local variables (if any) needed for |do_pre|.
@<OUT: Local variables for |do_pre|@>=
var k:int_15; {general purpose variable}
@!p,@!q,@!r:byte_pointer; {indices into |byte_mem|}
@!comment:packed array[1..comm_length] of char; {preamble comment prefix}
@ And here is the device dependent code for |do_pre|; the \.{DVI} preamble
comment written to |out_file| is similar to the one produced by \.{GFtoPK},
but we want to prepend our own preamble comment string only once.
@<OUT: Process the |pre|@>=
out_one(pre); out_one(dvi_id);
out_four(dvi_num); out_four(dvi_den); out_four(dvi_mag);@/
p:=pckt_start[pckt_ptr-1]; q:=byte_ptr; {location of old \.{DVI} comment}
comment:=preamble_comment; pckt_room(comm_length);
for k:=1 to comm_length do append_byte(xord[comment[k]]);
while byte_mem[p]=bi(" ") do incr(p); {remove leading blanks}
if p=q then Decr(byte_ptr)(from_length)
else begin k:=0;
while (k<comm_length)and(byte_mem[p+k]=byte_mem[q+k]) do incr(k);
if k=comm_length then Incr(p)(comm_length);
end;
k:=byte_ptr-p; {total length}
if k>255 then
begin k:=255; q:=p+255-comm_length; {at most 255 bytes}
end;
out_one(k); out_packet(new_packet); flush_packet;
for r:=p to q-1 do out_one(bo(byte_mem[r]));
@ These are the additional local variables (if any) needed for |do_bop|;
the variables |@!i| and |@!j| are already declared.
@<OUT: More local variables for |do_bop|@>=
@ And here is the device dependent code for |do_bop|.
@<OUT: Process a |bop|@>=
out_one(bop); incr(out_pages);
for i:=0 to 9 do out_four(count[i]);
out_four(out_back); out_back:=out_loc-45;
out_fnt:=invalid_font;
@ These are the local variables (if any) needed for |do_eop|.
@<OUT: Local variables for |do_eop|@>=
@{var@}
@ And here is the device dependent code for |do_eop|.
@<OUT: Process an |eop|@>=
out_one(eop);
@ These are the local variables (if any) needed for |do_push|.
@<OUT: Local variables for |do_push|@>=
@{var@}
@ And here is the device dependent code for |do_push|.
@<OUT: Process a |push|@>=
if stack_ptr>out_stack then out_stack:=stack_ptr;
out_one(push);
@ These are the local variables (if any) needed for |do_pop|.
@<OUT: Local variables for |do_pop|@>=
@{var@}
@ And here is the device dependent code for |do_pop|.
@<OUT: Process a |pop|@>=
out_one(pop);
@ These are the additional local variables (if any) needed for |do_xxx|;
the variable |@!p|, the pointer to the packet containing the special
string, is already declared.
@<OUT: More local variables for |do_xxx|@>=
@ And here is the device dependent code for |do_xxx|.
@<OUT: Process an |xxx|@>=
out_unsigned(xxx1,pckt_length(p)); out_packet(p);
@ These are the local variables (if any) needed for |do_right|.
@<OUT: Local variables for |do_right|@>=
@{var@}
@ And here is the device dependent code for |do_right|.
@<OUT: Process a |right| or |w| or |x|@>=
if cur_class<right_cl then out_one(cur_cmd) {|w0| or |x0|}
else out_signed(dvi_right_cmd[cur_class],cur_parm); {|right|, |w|, or |x|}
@ Here we update the |out_max_h| value.
@<OUT: Move right@>=
if abs(cur_h)>out_max_h then out_max_h:=abs(cur_h);
@ These are the local variables (if any) needed for |do_down|.
@<OUT: Local variables for |do_down|@>=
@{var@}
@ And here is the device dependent code for |do_down|.
@<OUT: Process a |down| or |y| or |z|@>=
if cur_class<down_cl then out_one(cur_cmd) {|y0| or |z0|}
else out_signed(dvi_down_cmd[cur_class],cur_parm); {|down|, |y|, or |z|}
@ Here we update the |out_max_v| value.
@<OUT: Move down@>=
if abs(cur_v)>out_max_v then out_max_v:=abs(cur_v);
@ These are the local variables (if any) needed for |do_width|.
@<OUT: Local variables for |do_width|@>=
@{var@}
@ And here is the device dependent code for |do_width|.
@<OUT: Typeset a |width|@>=
out_one(set_rule);
out_four(width_dimen); out_four(cur_h_dimen);
@ These are the additional local variables (if any) needed for |do_rule|;
the variable |@!visible| is already declared.
@<OUT: More local variables for |do_rule|@>=
@ And here is the device dependent code for |do_rule|.
@<OUT: Typeset a visible |rule|@>=
out_one(dvi_rule_cmd[cur_upd]);
out_four(cur_v_dimen); out_four(cur_h_dimen);
@ @<OUT: Typeset an invisible |rule|@>=
@<OUT: Typeset a visible |rule|@>
@ These are the local variables (if any) needed for |do_font|.
@<OUT: Local variables for |do_font|@>=
@{var@}
@ And here is the device dependent code for |do_font|; if the \.{VF} file
for a font could not be found, we simply assume this must be a real font.
@<OUT: Look for a font file before trying to read the \.{VF} file;
if found |goto done|@>=
@ @<OUT: Look for a font file after trying to read the \.{VF} file;
if found |goto done|@>=
if(out_nf>=max_fonts) then overflow(str_fonts,max_fonts);
print('OUT: font ',cur_fnt:1); d_print(' => ',out_nf:1);
print_font(cur_fnt);
d_print(' at ',font_scaled(cur_fnt):1,' DVI units'); print_ln('.');
font_type(cur_fnt):=out_font_type; font_out(cur_fnt):=out_nf;
out_fnts[out_nf]:=cur_fnt; incr(out_nf);
out_fnt_def(cur_fnt); goto done;
@ These are the local variables (if any) needed for |do_char|.
@<OUT: Local variables for |do_char|@>=
@{var@}
@ And here is the device dependent code for |do_char|.
@<OUT: Typeset a |char|@>=
begin @!debug if font_type(cur_fnt)<>out_font_type then confusion(str_fonts);
gubed @;
if cur_fnt<>out_fnt then
begin out_unsigned(fnt1,font_out(cur_fnt)); out_fnt:=cur_fnt;
end;
out_char(cur_upd,cur_ext,cur_res);
end
@ If the program terminates in the middle of a page, we write as many
|pop|s as necessary and one |eop|.
@<OUT: Finish incomplete page@>=
begin while stack_ptr>0 do
begin out_one(pop); decr(stack_ptr);
end;
out_one(eop);
end;
@ If the output file has been started, we write the postamble; in
addition we print the number of bytes and pages written to |out_file|.
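The count of trailing 223's used below can be checked with a minimal
sketch in Python (not part of the program). It assumes that |out_loc| is
the number of bytes written so far, including the final |dvi_id| byte; the
function name is hypothetical.

```python
def trailer_padding(out_loc):
    """Number of 223 bytes to append after out_loc bytes have been written.

    Mirrors k:=7-((out_loc-1) mod 4): at least four and at most seven
    padding bytes, chosen so that the total length is a multiple of four.
    """
    return 7 - ((out_loc - 1) % 4)


# Show the padding for a few hypothetical file lengths.
for out_loc in (1, 2, 3, 4, 5):
    k = trailer_padding(out_loc)
    print(out_loc, k, out_loc + k)
```

Under that assumption the padded file length is always a multiple of
four, and between four and seven bytes of value 223 are written.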
@<OUT: Finish output file(s)@>=
if out_loc>0 then
begin @<OUT: Write the postamble@>;
k:=7-((out_loc-1) mod 4); {the number of 223's}
while k>0 do
begin out_one(223); decr(k);
end;
print('OUT file: ',out_loc:1,' bytes, ',out_pages:1,' page');
if out_pages<>1 then print('s');
end
else print('OUT file: no output');
print_ln(' written.');
if out_pages=0 then mark_harmless;
@ Here we simply write the values accumulated during the \.{DVI} output.
@<OUT: Write the postamble@>=
out_one(post); out_four(out_back); out_back:=out_loc-5;@/
out_four(dvi_num); out_four(dvi_den); out_four(dvi_mag);@/
out_four(out_max_v); out_four(out_max_h);@/
out_one(out_stack div @"100); out_one(out_stack mod @"100);@/
out_one(out_pages div @"100); out_one(out_pages mod @"100);@/
k:=out_nf;
while k>0 do
begin decr(k); out_fnt_def(out_fnts[k]);
end;
out_one(post_post); out_four(out_back);@/
out_one(dvi_id)
@ Here we could print more memory usage statistics; this possibility is,
however, not used for \.{DVIcopy}.
@<Print more memory usage statistics@>=
@* Subroutines for typesetting commands.
This is the central part of the whole \.{\title} program:
When a typesetting command from the \.{DVI} file or from a \.{VF} packet
has been decoded, one of the typesetting routines defined below is
invoked to execute the command; apart from the necessary bookkeeping,
these routines invoke device dependent code defined earlier.
These typesetting routines communicate with the rest of the program
through global variables.
@<Glob...@>=
@!type_setting:boolean; {|true| while typesetting a page}
@!count:array[0..9] of int_32; {counts from last |bop| command}
@!device
@!h_conv:real; {converts \.{DVI} units to horizontal pixels}
@!v_conv:real; {converts \.{DVI} units to vertical pixels}
@!h_pixels:pix_value; {a horizontal dimension in pixels}
@!v_pixels:pix_value; {a vertical dimension in pixels}
@!temp_pix:pix_value; {temporary value for pixel rounding}
ecived
@ @<Set init...@>=
type_setting:=false;
@ A stack is used to keep track of the current horizontal and vertical
position, |h| and |v|, and the four registers |w|, |x|, |y|, and |z|;
the register pairs |(w,x)| and |(y,z)| are maintained as arrays.
@<Types...@>=
@!stack_pointer=0..stack_size;@/
@!pair_32=array[0..1] of int_32; {a pair of |int_32| variables}
@!stack_record=record@;@/
@!h_field:int_32; {horizontal position |h|}
@!v_field:int_32; {vertical position |v|}
@!device
@!hh_field:pix_value; {horizontal pixel position |hh|}
@!vv_field:pix_value; {vertical pixel position |vv|}
ecived @; @/
@!w_x_field:pair_32; {|w| and |x| register for horizontal movements}
@!y_z_field:pair_32; {|y| and |z| register for vertical movements}
end;
@ The current values are kept in |cur_stack|; they are pushed onto and
popped from |stack|. We use \.{WEB} macros to access the current values.
@d cur_h==cur_stack.h_field {the current |@!h| value}
@d cur_v==cur_stack.v_field {the current |@!v| value}
@d cur_hh==cur_stack.hh_field {the current |@!hh| value}
@d cur_vv==cur_stack.vv_field {the current |@!vv| value}
@d cur_w_x==cur_stack.w_x_field {the current |@!w| and |@!x| value}
@d cur_y_z==cur_stack.y_z_field {the current |@!y| and |@!z| value}
@<Glob...@>=
@!stack:array[1..stack_size] of stack_record; {the pushed values}
@!cur_stack:stack_record; {the current values}
@!zero_stack:stack_record; {initial values}
@!stack_ptr:stack_pointer; {last used position in |stack|}
@ @<Set init...@>=
stack_ptr:=0;
zero_stack.h_field:=0; zero_stack.v_field:=0;
@!device zero_stack.hh_field:=0; zero_stack.vv_field:=0; @+ ecived @; @/
for i:=0 to 1 do
begin zero_stack.w_x_field[i]:=0; zero_stack.y_z_field[i]:=0;
end;
@ A sequence of consecutive rules, or consecutive characters in a fixed-width
font whose width is not an integer number of pixels, can cause |hh| to drift
far away from a correctly rounded value. \.{\title} ensures that the
amount of drift will never exceed |max_h_drift| pixels; similarly |vv|
shall never drift away from the correctly rounded value by more than
|max_v_drift| pixels.
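The drift rule can be illustrated by a minimal sketch in Python (not part
of the program). The function name and its parameters are hypothetical;
the arithmetic follows the |do_h_pixels| macro defined later, with the
correctly rounded position passed in as an argument.

```python
MAX_H_DRIFT = 2  # same bound as max_h_drift in the program text


def advance_hh(hh, delta_pixels, rounded_h, max_drift=MAX_H_DRIFT):
    """Advance the pixel position hh and limit its drift.

    hh is first increased by delta_pixels; if it then differs from the
    correctly rounded position rounded_h by more than max_drift pixels,
    it is pulled back to exactly max_drift pixels away from rounded_h.
    """
    hh += delta_pixels
    if abs(rounded_h - hh) > max_drift:
        if rounded_h > hh:
            hh = rounded_h - max_drift
        else:
            hh = rounded_h + max_drift
    return hh


print(advance_hh(10, 5, 20))  # hh lags behind the rounded value
print(advance_hh(10, 5, 11))  # hh runs ahead of the rounded value
```

The vertical case is the same with |vv|, |v|, and |max_v_drift|.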
@d max_h_drift=2 {we insist that abs|(hh-h_pixel_round(h))<=max_h_drift|}
@d max_v_drift=2 {we insist that abs|(vv-v_pixel_round(v))<=max_v_drift|}
@ Let us start with the simple cases:
The |do_pre| procedure is called when the preamble has been read from
the \.{DVI} file; the preamble comment has just been converted into a
temporary packet with the |new_packet| procedure.
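For real devices |do_pre| also computes the factors |h_conv| and |v_conv|
that convert \.{DVI} units to pixels. A minimal sketch in Python (not part
of the program) shows the same formula; the function name is hypothetical,
and the sample values are the numerator and denominator that \TeX\ writes,
so that one \.{DVI} unit is one scaled point.

```python
def pixel_conv(num, den, mag, resolution):
    """Pixels per DVI unit.

    One DVI unit is num/den units of 10^-7 metre, an inch is 254000 such
    units, and mag/1000 is the magnification.
    """
    return (num / 254000.0) * (resolution / den) * (mag / 1000.0)


# TeX's values: num=25400000, den=473628672; 300 dpi, no magnification.
conv = pixel_conv(25400000, 473628672, 1000, 300)
units_per_inch = 72.27 * 65536  # scaled points in one inch
print(conv * units_per_inch)  # close to 300 pixels
```

With these values one inch of \.{DVI} units maps to about 300 pixels, as
expected for a 300~dpi device.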
@p procedure do_pre;@/
@<OUT: Local variables for |do_pre|@>@;
begin @!device
h_conv:=(dvi_num/254000.0)*(h_resolution/dvi_den)*(dvi_mag/1000.0);
v_conv:=(dvi_num/254000.0)*(v_resolution/dvi_den)*(dvi_mag/1000.0);
ecived @; @/
@<OUT: Process the |pre|@>@;@/
end;
@ The |do_bop| procedure is called when a |bop| has been read. This
routine determines whether a page shall be processed or skipped and sets
the variable |type_setting| accordingly.
@p procedure do_bop;
var i,@!j:0..9; {indices into |count|}
@<OUT: More local variables for |do_bop|@>@;
begin @<Determine whether this page should be processed or skipped@>;
print('DVI: ');
if type_setting then
begin
cur_stack:=zero_stack; cur_fnt:=invalid_font;@/
@<OUT: Process a |bop|@>@;@/
print('processing');
end
else print('skipping');
print(' page ',count[0]:1); j:=9;
while (j>0)and(count[j]=0) do decr(j);
for i:=1 to j do print('.',count[i]:1);
d_print(' at ',dvi_loc-45:1);
print_ln('.');
end;
@ A page selection mechanism is not yet implemented; for the moment
all pages are processed.
@<Determine whether this page...@>=
type_setting:=true
@ The |do_eop| procedure is called in order to process an |eop|;
the stack should be empty.
@p procedure do_eop;@/
@<OUT: Local variables for |do_eop|@>@;
begin if stack_ptr<>0 then bad_dvi;
@<OUT: Process an |eop|@>@;
type_setting:=false;
end;
@ The procedures |do_push| and |do_pop| are called in order to process
|push| and |pop| commands; |do_push| must check for stack overflow,
|do_pop| should never be called when the stack is empty.
@p procedure do_push; {push onto stack}
@<OUT: Local variables for |do_push|@>@;
begin incr_stack(stack_ptr); stack[stack_ptr]:=cur_stack;@/
@<OUT: Process a |push|@>@;
end;
procedure do_pop; {pop from stack}
@<OUT: Local variables for |do_pop|@>@;
begin if stack_ptr=0 then bad_dvi;
cur_stack:=stack[stack_ptr]; decr(stack_ptr);@/
@<OUT: Process a |pop|@>@;
end;
@ The |do_xxx| procedure is called in order to process a special command;
the bytes of the special string have been put into |byte_mem| as the
current string. They are converted to a temporary packet and discarded
again.
@p procedure do_xxx;
var p:pckt_pointer; {temporary packet}
@<OUT: More local variables for |do_xxx|@>@;
begin p:=new_packet;@/
@<OUT: Process an |xxx|@>@;@/
flush_packet;
end;
@ Next are the movement commands:
The |do_right| procedure is called in order to process the horizontal
movement commands |right|, |w|, and |x|.
@d do_h_pixels(#)== {check for proper horizontal pixel rounding}
begin Incr(cur_hh)(#); temp_pix:=h_pixel_round(cur_h);
if abs(temp_pix-cur_hh)>max_h_drift then
if temp_pix>cur_hh then cur_hh:=temp_pix-max_h_drift
else cur_hh:=temp_pix+max_h_drift;
end
@p procedure do_right;@/
@<OUT: Local variables for |do_right|@>@;
begin if cur_class>=w_cl then cur_w_x[cur_class-w_cl]:=cur_parm
else if cur_class<right_cl then cur_parm:=cur_w_x[cur_class-w0_cl];
@<OUT: Process a |right| or |w| or |x|@>@;@/
Incr(cur_h)(cur_parm);
@!device
if (cur_parm>=font_space(cur_fnt))or(cur_parm<=-4*font_space(cur_fnt)) then
cur_hh:=h_pixel_round(cur_h)
else do_h_pixels(h_pixel_round(cur_parm));
ecived @; @/
@<OUT: Move right@>@;
end;
@ The |do_down| procedure is called in order to process the vertical
movement commands |down|, |y|, and |z|.
@d do_v_pixels(#)== {check for proper vertical pixel rounding}
begin Incr(cur_vv)(#); temp_pix:=v_pixel_round(cur_v);
if abs(temp_pix-cur_vv)>max_v_drift then
if temp_pix>cur_vv then cur_vv:=temp_pix-max_v_drift
else cur_vv:=temp_pix+max_v_drift;
end
@p procedure do_down;
@<OUT: Local variables for |do_down|@>@;
begin if cur_class>=y_cl then cur_y_z[cur_class-y_cl]:=cur_parm
else if cur_class<down_cl then cur_parm:=cur_y_z[cur_class-y0_cl];
@<OUT: Process a |down| or |y| or |z|@>@;@/
Incr(cur_v)(cur_parm);
@!device
if abs(cur_parm)>=5*font_space(cur_fnt) then cur_vv:=v_pixel_round(cur_v)
else do_v_pixels(v_pixel_round(cur_parm));
ecived @; @/
@<OUT: Move down@>@;
end;
@ The |do_width| procedure is called in order to increase the current
horizontal position |cur_h| by |cur_h_dimen| in exactly the same way
as if a character of width |cur_h_dimen| had been typeset.
@p procedure do_width;@/
@<OUT: Local variables for |do_width|@>@;
begin @<OUT: Typeset a |width|@>@;@/
Incr(cur_h)(cur_h_dimen);
@!device do_h_pixels(h_pixels); @+ ecived @/ @;
@<OUT: Move right@>@;
end;
@ Finally we have the commands for the typesetting of rules and characters;
the global variable |cur_upd| is |true| if the horizontal position shall
be updated (\\{set} commands).
Here are two other subroutines that we need: They compute the number of
pixels in the height or width of a rule. Characters and rules will line up
properly if the sizes are computed precisely as specified here. (Since
|h_conv| and |v_conv| are computed with some floating-point roundoff error,
in a machine-dependent way, format designers who are tailoring something for
a particular resolution should not plan their measurements to come out to an
exact integer number of pixels; they should compute things so that the
rule dimensions are a little less than an integer number of pixels, e.g.,
4.99 instead of 5.00.)
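The rounding rule of the two functions below can be tried out with a
minimal sketch in Python (not part of the program). The function name is
hypothetical, and Python's |int| stands in for \PASCAL's |trunc|, since
both truncate toward zero.

```python
def rule_pixels(conv, x):
    """Ceiling of conv*x, computed by truncating and then correcting.

    conv is the number of pixels per DVI unit and x a rule dimension in
    DVI units; the result is the number of pixels covered by the rule.
    """
    n = int(conv * x)  # truncation toward zero
    if n < conv * x:
        return n + 1
    return n


print(rule_pixels(0.5, 10))  # an exact number of pixels stays as it is
print(rule_pixels(0.5, 11))  # a fractional pixel is rounded up
```

Because the product is rounded up, a dimension that is meant to be an
exact number of pixels may gain one pixel through roundoff error; this is
the reason for the advice to aim slightly below an integer.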
@p @!device
function h_rule_pixels(x:int_32):pix_value;
{computes $\lceil|h_conv|\cdot x\rceil$}
var n:int_32;
begin n:=trunc(h_conv*x);
if n<h_conv*x then h_rule_pixels:=n+1 @+ else h_rule_pixels:=n;
end;
function v_rule_pixels(x:int_32):pix_value;
{computes $\lceil|v_conv|\cdot x\rceil$}
var n:int_32;
begin n:=trunc(v_conv*x);
if n<v_conv*x then v_rule_pixels:=n+1 @+ else v_rule_pixels:=n;
end;
ecived
@ The |do_rule| procedure is called in order to typeset a rule.
@p procedure do_rule;@/
var visible:boolean;
@<OUT: More local variables for |do_rule|@>@;
begin if (cur_h_dimen>0)and(cur_v_dimen>0) then
begin visible:=true;
@!device
h_pixels:=h_rule_pixels(cur_h_dimen);
v_pixels:=v_rule_pixels(cur_v_dimen);
ecived @; @/
@<OUT: Typeset a visible |rule|@>@;
end
else begin visible:=false;
@<OUT: Typeset an invisible |rule|@>@;
end;
if cur_upd then
begin Incr(cur_h)(cur_h_dimen);
@!device if not visible then h_pixels:=h_rule_pixels(cur_h_dimen);
do_h_pixels(h_pixels);
ecived @; @/
@<OUT: Move right@>@;
end;
end;
@ Last but not least the |do_char| procedure is called in order to typeset
character~|cur_res| with extension~|cur_ext| from the real font~|cur_fnt|.
@p procedure do_char;@/
@<OUT: Local variables for |do_char|@>@;
begin
@<OUT: Typeset a |char|@>;
if cur_upd then
begin Incr(cur_h)(widths[cur_wp]);
@!device do_h_pixels(font_pixel(cur_fnt)(cur_res)); @+ ecived @; @/
@<OUT: Move right@>@;
end;
end;
@ If the program terminates abnormally, the following code may be
invoked in the middle of a page.
@<Finish output file(s)@>=
if type_setting then @<OUT: Finish incomplete page@>@;
@<OUT: Finish output file(s)@>
@ When the first character of font~|cur_fnt| is about to be typeset,
the |do_font| procedure is called in order to decide whether this is
a virtual font or a real font.
One step in this decision is the attempt to find and read the \.{VF}
file for this font; other attempts to locate a font file may be performed
before and after that, depending on the nature of the output device and
on the structure of the file system at a particular installation.
In any case |do_font| must change |font_type(cur_fnt)| from |new_font_type|
to anything else; as a last resort one might use the \.{TFM} width data
and leave blank spaces in the output.
@p procedure do_font;
label done;
@<OUT: Local variables for |do_font|@>@;
begin
@<OUT: Look for a font file before trying to read the \.{VF} file;
if found |goto done|@>@;@/
if do_vf then goto done; {try to read the \.{VF} file}
@<OUT: Look for a font file after trying to read the \.{VF} file;
if found |goto done|@>@;@/
done:
@!debug if font_type(cur_fnt)=new_font_type then confusion(str_fonts);
gubed@;
end;
@* Interpreting VF packets.
The |pckt_first_par| procedure first reads a \.{DVI} command byte from
the packet into |cur_cmd|; then |cur_parm| is set to the value of the
first parameter (if any) and |cur_class| to the command class.
@p procedure pckt_first_par;
begin cur_cmd:=pckt_ubyte;
case dvi_par[cur_cmd] of
char_par: if cur_cmd<set1 then
begin cur_ext:=0; cur_res:=cur_cmd; cur_upd:=true
end
else begin cur_upd:=(cur_cmd<put1);
case cur_cmd-dvi_char_cmd[cur_upd] of
0: cur_ext:=0;
1: cur_ext:=pckt_ubyte;
2: cur_ext:=pckt_upair;
3: cur_ext:=pckt_strio;
end;
cur_res:=pckt_ubyte;
end;
no_par: do_nothing;
dim1_par: cur_parm:=pckt_sbyte;
num1_par: cur_parm:=pckt_ubyte;
dim2_par: cur_parm:=pckt_spair;
num2_par: cur_parm:=pckt_upair;
dim3_par: cur_parm:=pckt_strio;
num3_par: cur_parm:=pckt_utrio;
three_cases(dim4_par): cur_parm:=pckt_squad; {|dim4|, |num4|, or |numu|}
rule_par:
begin cur_v_dimen:=pckt_squad; cur_h_dimen:=pckt_squad;
cur_upd:=(cur_cmd=set_rule);
end;
fnt_par:cur_parm:=cur_cmd-fnt_num_0;
end;
cur_class:=dvi_cl[cur_cmd];
end;
@ The |do_vf_packet| procedure is called in order to interpret the
character packet for a virtual character. Such a packet may contain the
instruction to typeset a character from the same or another
virtual font; in such cases |do_vf_packet| calls itself recursively.
The recursion level, i.e., the number of times this has happened, is
kept in the global variable |n_recur| and should not exceed
|max_recursion|.
@^recursion@>
@<Types...@>=
@!recur_pointer=0..max_recursion;
@ The \.{\title} processor should detect an infinite recursion caused by
bad \.{VF} files; thus a new recursion level is entered even in cases
where this could be avoided without difficulty.
If the recursion level exceeds the allowed maximum, we want to give
a traceback showing how this has happened; thus some of the global variables
used in different invocations of |do_vf_packet| are saved in a stack,
others are saved as local variables of |do_vf_packet|.
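The idea of a bounded recursion with a traceback can be shown by a minimal
sketch in Python (not part of the program). Everything in it is
hypothetical: virtual characters are modelled as a dictionary from
(font, character) pairs to the list of pairs they typeset, and the limit
of 10 merely stands in for |max_recursion|.

```python
MAX_RECURSION = 10  # hypothetical limit standing in for max_recursion


class RecursionOverflow(Exception):
    """Raised with the traceback, deepest level first."""


def expand(packets, key, trail=()):
    """Expand the virtual character key into real characters.

    packets maps a virtual (font, char) pair to the pairs it typesets;
    a pair that is not in packets is treated as a real character.
    trail records the pairs being expanded, outermost first.
    """
    if key not in packets:
        return [key]
    trail = trail + (key,)
    if len(trail) > MAX_RECURSION:
        raise RecursionOverflow(list(reversed(trail)))
    result = []
    for sub in packets[key]:
        result.extend(expand(packets, sub, trail))
    return result


good = {("v", 65): [("r", 65), ("v", 66)], ("v", 66): [("r", 66)]}
print(expand(good, ("v", 65)))
```

A pair of virtual characters that refer to each other makes the sketch
raise |RecursionOverflow| instead of looping forever, and the exception
carries the chain of characters that led there.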
@<Glob...@>=
@!recur_fnt:array[recur_pointer] of font_number; {this packet's font}
@!recur_ext:array[recur_pointer] of int_24; {this packet's extension}
@!recur_res:array[recur_pointer] of eight_bits; {this packet's residue}
@!recur_pckt:array[recur_pointer] of pckt_pointer; {the packet}
@!recur_loc:array[recur_pointer] of byte_pointer; {next byte of packet}
@!n_recur:recur_pointer; {current recursion level}
@!recur_used:recur_pointer; {highest recursion level used so far}
@ @<Set init...@>=
n_recur:=0; recur_used:=0;
@ Here now is the |do_vf_packet| procedure.
@p procedure do_vf_packet;
label continue,found,done;
var k:recur_pointer; {loop index}
@!p:pckt_pointer; {a packet}
@!f:int_8u; {packet type flag}
@!save_upd:boolean; {used to save |cur_upd|}
@!save_wp:width_pointer; {used to save |cur_wp|}
@!save_limit:byte_pointer; {used to save |cur_limit|}
begin @<VF: Save values on entry to |do_vf_packet|@>;@/
@<VF: Initialize variables for |do_vf_packet|; or |goto done|@>;@/
@<VF: Interpret the \.{DVI} commands in the packet@>;@/
done:if save_upd then
begin cur_h_dimen:=widths[save_wp];
@!device h_pixels:=pix_widths[save_wp]; @+ ecived @; @/
do_width;
end;
@<VF: Restore values on exit from |do_vf_packet|@>;@/
end;
@ On entry to |do_vf_packet| several values must be saved.
@<VF: Save values on entry to |do_vf_packet|@>=
save_upd:=cur_upd;
save_wp:=cur_wp;@/
recur_fnt[n_recur]:=cur_fnt;
recur_ext[n_recur]:=cur_ext;
recur_res[n_recur]:=cur_res
@ Some of these values must be restored on exit from |do_vf_packet|.
@<VF: Restore values on exit from |do_vf_packet|@>=
cur_fnt:=recur_fnt[n_recur]
@ Here we initialize the variables needed to interpret a character
packet.
@<VF: Initialize variables for |do_vf_packet|; or |goto done|@>=
p:=font_vf_packet(cur_fnt)(cur_res);
if p=invalid_packet then
begin pckt_warning; goto done;
end;
f:=find_packet(cur_ext,p);
recur_pckt[n_recur]:=cur_pckt;
save_limit:=cur_limit;
cur_fnt:=font_vf_fnt(cur_fnt)
@ If |cur_pckt| is the empty packet, we manufacture a |put| command;
otherwise we read and interpret \.{DVI} commands from the packet.
@<VF: Interpret the \.{DVI} commands in the packet@>=
if cur_pckt=empty_packet then
begin cur_class:=char_cl; goto found;
end;
if cur_loc>=cur_limit then goto done;
continue: pckt_first_par;
found: case cur_class of
char_cl: @<VF: Typeset a |char|@>;
rule_cl: do_rule;
xxx_cl:
begin pckt_room(cur_parm);
while cur_parm>0 do
begin append_byte(pckt_ubyte); decr(cur_parm);
end;
do_xxx;
end;
push_cl: do_push;
pop_cl: do_pop;
five_cases(w0_cl): do_right; {|right|, |w|, or |x|}
five_cases(y0_cl): do_down; {|down|, |y|, or |z|}
fnt_cl: cur_fnt:=cur_parm;
othercases confusion(str_packets); {font definition or invalid}
endcases;
if cur_loc<cur_limit then goto continue
@ When a font is used for the first time, the |do_font| procedure is
called to decide whether this is a virtual font or not.
The final |put| of a simple packet may be changed into |set_char| or
\\{set}.
@<VF: Typeset a |char|@>=
begin cur_wp:=font_width(cur_fnt)(cur_res);
if font_type(cur_fnt)=new_font_type then do_font; {|cur_fnt| was not yet used}
if (cur_loc=cur_limit)and(f=vf_simple) and save_upd then
begin save_upd:=false; cur_upd:=true;
end;
if font_type(cur_fnt)=vf_font_type then
@<VF: Enter a new recursion level@>
else do_char;
end
@ Before entering a new recursion level we must test for overflow; in
addition a few variables must be saved and restored.
A |set_char| or \\{set} followed by |pop| is changed into |put|.
@<VF: Enter a new recursion level@>=
begin recur_loc[n_recur]:=cur_loc; {save}
if cur_loc<cur_limit then
if byte_mem[cur_loc]=bi(pop) then cur_upd:=false;
if n_recur=recur_used then
if recur_used=max_recursion then
@<VF: Display the recursion traceback and terminate@>
else incr(recur_used);@/
incr(n_recur);
do_vf_packet;
decr(n_recur); {recurse}
cur_loc:=recur_loc[n_recur];
cur_limit:=save_limit; {restore}
end
@ @<VF: Display the recursion traceback and terminate@>=
begin print_ln(' !Infinite VF recursion?');
@.Infinite VF recursion@>
for k:=max_recursion downto 0 do
begin print('level=',k:1,' font');
d_print('=',recur_fnt[k]:1);
print_font(recur_fnt[k]);
print(' char=',recur_res[k]:1);
if recur_ext[k]<>0 then print('.',recur_ext[k]:1);
print_ln(' ');
@!debug hex_packet(recur_pckt[k]); print_ln('loc=',recur_loc[k]:1);
gubed@;
end;
overflow(str_recursion,max_recursion);
end
@* Interpreting the DVI file.
When a |bop| has been read, the |do_page| procedure is called to
interpret one page of the \.{DVI} file; |do_page| returns when the
corresponding |eop| has been read.
@p procedure do_page;
label done;
var temp_byte:int_8u; {byte for temporary variables}
@!temp_int:int_32; {integer for temporary variables}
@!k:int_15; {general purpose variable}
begin for k:=0 to 9 do count[k]:=dvi_squad;
temp_int:=dvi_squad; do_bop;
dvi_first_par;
if type_setting then @<DVI: Process a page; then |goto done|@>
else @<DVI: Skip a page; then |goto done|@>;
done:if cur_cmd<>eop then bad_dvi;
if type_setting then do_eop;
end;
@ All \.{DVI} commands are processed, as long as |cur_class<>invalid_cl|;
then we should have found an |eop|.
@<DVI: Process a page; then |goto done|@>=
loop begin
case cur_class of
char_cl: @<DVI: Typeset a |char|@>;
rule_cl:
if cur_upd and(cur_v_dimen=width_dimen) then
begin @!device h_pixels:=h_pixel_round(cur_h_dimen); @+ ecived @; @/
do_width;
end
else do_rule;
xxx_cl:
begin pckt_room(cur_parm);
while cur_parm>0 do
begin append_byte(dvi_ubyte); decr(cur_parm);
end;
do_xxx;
end;
push_cl: do_push;
pop_cl: do_pop;
five_cases(w0_cl): do_right; {|right|, |w|, or |x|}
five_cases(y0_cl): do_down; {|down|, |y|, or |z|}
fnt_cl: dvi_font;
fnt_def_cl: dvi_do_font(random_reading);
invalid_cl: goto done;
end; {there are no other cases}
dvi_first_par; {get the next command}
end
@ While skipping a page all commands other than font definitions are
ignored.
@<DVI: Skip a page; then |goto done|@>=
loop begin
case cur_class of
xxx_cl: while cur_parm>0 do
begin temp_byte:=dvi_ubyte; decr(cur_parm);
end;
fnt_def_cl: dvi_do_font(random_reading);
invalid_cl: goto done;
othercases do_nothing;
endcases;
dvi_first_par; {get the next command}
end
@ When a font is used for the first time, the |do_font| procedure is
called to decide whether this is a virtual font or not.
@<DVI: Typeset a |char|@>=
begin set_cur_wp; if cur_wp=invalid_width then bad_dvi;
if font_type(cur_fnt)=new_font_type then do_font; {|cur_fnt| was not yet used}
if font_type(cur_fnt)=vf_font_type then do_vf_packet @+ else do_char;
end
@ The |do_dvi| procedure reads the entire \.{DVI} file and initiates
whatever actions may be necessary.
@p procedure do_dvi;
var temp_byte:int_8u; {byte for temporary variables}
@!temp_int:int_32; {integer for temporary variables}
@!k:int_15; {general purpose variable}
begin @<DVI: Process the preamble@>;
if random_reading then @<DVI: Process the postamble@>;
repeat dvi_first_par;
while cur_class=fnt_def_cl do
begin dvi_do_font(random_reading); dvi_first_par;
end;
if cur_cmd=bop then do_page;
until cur_cmd<>eop;
if cur_cmd<>post then bad_dvi;
end;
@ @<DVI: Process the preamble@>=
if dvi_ubyte<>pre then bad_dvi;
if dvi_ubyte<>dvi_id then bad_dvi;
dvi_num:=dvi_pquad; dvi_den:=dvi_pquad; dvi_mag:=dvi_pquad;
tfm_conv:=(25400000.0/dvi_num)*(dvi_den/473628672)/16.0;
temp_byte:=dvi_ubyte; pckt_room(temp_byte);
for k:=1 to temp_byte do append_byte(dvi_ubyte);
print('DVI file: '''); print_packet(new_packet); print_ln(''',');
print_ln(' num=',dvi_num:1,', den=',dvi_den:1,', mag=',
dvi_mag:1,'.');
do_pre; flush_packet
@ @<Glob...@>=
@!dvi_num:int_31; {numerator}
@!dvi_den:int_31; {denominator}
@!dvi_mag:int_31; {magnification}
@!dvi_pages:int_16u; {number of pages}
@!dvi_back:int_32; {a back pointer}
@ @<DVI: Process the postamble@>=
begin dvi_back:=dvi_loc; {remember start of first page}
@<DVI: Find the postamble@>;
d_print_ln('DVI: postamble at ',dvi_loc-1:1);
temp_int:=dvi_pointer;
if dvi_num<>dvi_pquad then bad_dvi;
if dvi_den<>dvi_pquad then bad_dvi;
if dvi_mag<>dvi_pquad then bad_dvi;
temp_int:=dvi_squad; temp_int:=dvi_squad;
if stack_size<dvi_upair then overflow(str_stack,stack_size);
dvi_pages:=dvi_upair;
dvi_first_par;
while cur_class=fnt_def_cl do
begin dvi_do_font(false); dvi_first_par;
end;
if cur_cmd<>post_post then bad_dvi;
dvi_move(dvi_back); {back to start of first page}
end
@ @<DVI: Find the postamble@>=
temp_int:=dvi_length; if temp_int<53 then bad_dvi;
Decr(temp_int)(4);
repeat if temp_int=0 then bad_dvi;
dvi_move(temp_int); temp_byte:=dvi_ubyte; decr(temp_int);
until temp_byte<>223;
if temp_byte<>dvi_id then bad_dvi;
dvi_move(temp_int-4); if dvi_ubyte<>post_post then bad_dvi;
dvi_move(dvi_pquad); if dvi_ubyte<>post then bad_dvi
@* The main program.
The code for real devices is still rather incomplete.
Moreover several branches of the program have not been tested because
they are never used with \.{DVI} files made by \TeX\ and \.{VF} files
made by \.{VPtoVF}.
@ At the end of the program the output file(s) have to be finished and
on some systems it may be necessary to close input and\slash or output
files.
@^system dependencies@>
@p procedure close_files_and_terminate;
var k:@!int_15; {general purpose index}
begin @<Close input file(s)@>@;
@<Finish output file(s)@>@;
stat @<Print memory usage statistics@>;@+tats@;@/
@<Close output file(s)@>@;
@<Print the job |history|@>;
end;
@ Now we are ready to put it all together.
Here is where \.{\title} starts, and where it ends.
@^system dependencies@>
@p begin initialize; {get all variables initialized}
@<Initialize predefined strings@>@;
@<Open input file(s)@>@;
@<Open output file(s)@>@;
do_dvi; {process the entire \.{DVI} file}
close_files_and_terminate;
final_end:end.
@ @<Print memory usage statistics@>=
print_ln('Memory usage statistics:');
print(dvi_nf:1,' dvi, ',lcl_nf:1,' local, ');
@<Print more font usage statistics@>@;@/
print_ln('and ',nf:1,' internal fonts of ',max_fonts:1);
print_ln(n_widths:1,' widths of ',max_widths:1,' and ',
n_packets:1,' char packets for ',
n_chars:1,' characters of ',max_chars:1);
print_ln(pckt_ptr:1,' byte packets of ',max_packets:1,' with ',
byte_ptr:1,' bytes of ',max_bytes:1);
@<Print more memory usage statistics@>@;@/
print_ln(stack_used:1,' of ',stack_size:1,' stack and ',
recur_used:1,' of ',max_recursion:1,' recursion levels.');
@ Some implementations may wish to pass the |history| value to the
operating system so that it can be used to govern whether or not other
programs are started. Here we simply report the history to the user.
@^system dependencies@>
@<Print the job |history|@>=
case history of
spotless: print_ln('(No errors were found.)');
harmless_message: print_ln('(Did you see the warning message above?)');
error_message: print_ln('(Pardon me, but I think I spotted something wrong.)');
fatal_message: print_ln('(That was a fatal error, my friend.)');
end {there are no other cases}
@* System-dependent changes.
This section should be replaced, if necessary, by changes to the program
that are necessary to make \.{DVIcopy} work at a particular installation.
It is usually best to design your change file so that all changes to
previous sections preserve the section numbering; then everybody's version
will be consistent with the printed program. More extensive changes,
which introduce new sections, can be inserted here; then only the index
itself will get a new section number.
@^system dependencies@>
@* Index.
Pointers to error messages appear here together with the section numbers
where each ident\-i\-fier is used.